facebookresearch / mmd

ML models often mispredict, and it is hard to tell when and why. We present a data-mining-based approach to discover whether there is a certain form of data that particularly causes the model to mispredict.

☆18

Alternatives and similar repositories for mmd

Users who are interested in mmd are comparing it to the repositories listed below.

ALFA-group / adversarial-code-generation
[ICLR 2021] "Generating Adversarial Computer Programs using Optimized Obfuscations" by Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shi…
☆30 Updated 3 years ago
google-research-datasets / great
The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…
☆22 Updated 4 years ago
terryyz / DataAug4Code
Source Code Data Augmentation for Deep Learning: A Survey.
☆65 Updated last year
modit-team / MODIT
MODIT: On Multi-Modal Learning of Editing Source Code.
☆20 Updated 4 years ago
zwzhang44 / DietCode
☆20 Updated 2 years ago
dashends / CodeSyntax
Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding"
☆14 Updated 2 years ago
rizwan09 / REDCODER
☆45 Updated 2 weeks ago
saikat107 / NatGen
☆41 Updated 2 years ago
giganticode / jemma
JEMMA: An Extensible Java dataset for Many ML4Code Applications
☆19 Updated 2 years ago
reddy-lab-code-research / CodeAttack
Code for the AAAI 2023 paper "CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models"
☆31 Updated 2 years ago
amazon-science / recode
Releasing code for "ReCode: Robustness Evaluation of Code Generation Models"
☆52 Updated last year
panthap2 / deep-jit-inconsistency-detection
Deep Just-In-Time Inconsistency Detection Between Comments and Source Code: Artifact
☆22 Updated 2 years ago
RosaliaTufano / code_review
☆36 Updated 3 years ago
EngineeringSoftware / CoditT5
CoditT5: Pretraining for Source Code and Natural Language Editing
☆28 Updated 5 months ago
justinphan3110 / CoTexT
Code implementation for CoTexT: Multi-task Learning with Code-Text Transformer
☆36 Updated 3 years ago
lt-asset / REPOCOD
For our ACL25 Paper: Can Language Models Replace Programmers? RepoCod Says ‘Not Yet’ - by Shanchao Liang and Yiran Hu and Nan Jiang and L…
☆19 Updated 2 weeks ago
beyondacm / Que2Code
Code Snippet Recommendation from Stack Overflow Post
☆18 Updated 4 years ago
squaresLab / VarCLR
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning
☆39 Updated 2 years ago
microsoft / msrc-dpu-learning-to-represent-edits
C# Data Extraction for "Learning to Represent Edits"
☆26 Updated 6 years ago
yuewang-cuhk / awesome-programming-language-pretraining-papers
Recent Advances in Programming Language Pre-Trained Models (PL-PTMs)
☆58 Updated 3 years ago
wasiahmad / AVATAR
Official code of our work, AVATAR: A Parallel Corpus for Java-Python Program Translation.
☆54 Updated 11 months ago
nokia / codesearch
Models and datasets for annotated code search.
☆35 Updated 2 years ago
tech-srl / c3po
Code for the paper "A Structural Model for Contextual Code Changes"
☆32 Updated last year
DeepSoftwareAnalytics / CommitMsgEmpirical
☆28 Updated 2 years ago
Jun-jie-Huang / CoCLR
Source Code for ACL-21 main conference paper "CoSQA: 20,000+ Web Queries for Code Search and Question Answering".
☆45 Updated 2 years ago
sola-st / IdBench
A benchmark for evaluating embeddings of identifiers in source code.
☆22 Updated 3 years ago
google-research / runtime-error-prediction
This is the repository for the paper Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descripti…
☆25 Updated 2 years ago
mdrafiqulrabin / SIVAND
ESEC/FSE'21: Prediction-Preserving Program Simplification
☆10 Updated 2 years ago
zysszy / TreeGen-Pytorch
☆18 Updated 2 years ago
Alex-HaochenLi / RACS
[EMNLP'22] Code for 'Exploring Representation-level Augmentation for Code Search'
☆27 Updated last year