CUHK-ARISE / ml4code-dataset
A collection of datasets for machine learning for big code
☆61 Updated 3 years ago
Alternatives and similar repositories for ml4code-dataset
Users who are interested in ml4code-dataset are comparing it to the repositories listed below.
- A multi-lingual program repair benchmark set based on the Quixey Challenge ☆125 Updated 3 years ago
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆99 Updated 2 weeks ago
- VulRepair: A T5-Based Automated Software Vulnerability Repair ☆80 Updated 4 months ago
- For our ICSE21 paper "CURE: Code-Aware Neural Machine Translation for Automatic Program Repair" by Nan Jiang, Thibaud Lutellier, and Lin … ☆55 Updated 2 years ago
- ☆117 Updated 2 years ago
- BugsInPy: Benchmarking Bugs in Python Projects ☆110 Updated last year
- Code of our paper Applying CodeBERT for Automated Program Repair of Java Simple Bugs which is accepted to MSR 2021. ☆53 Updated 2 years ago
- For our ISSTA20 paper "CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair" by Thibaud Lutellier… ☆62 Updated 2 years ago
- This repository is to support contributions for tools and new data entries for the D2A dataset hosted in DAX ☆74 Updated 3 years ago
- ☠️ Ground-truth dataset for vulnerability prediction (known research datasets and data sources included such as NVD, CVE Details and OSV)… ☆96 Updated 2 years ago
- [ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees ☆89 Updated last month
- [TOSEM 2023] A Survey of Learning-based Automated Program Repair ☆69 Updated last year
- RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair http://arxiv.org/pdf/2312.15698 ☆35 Updated 2 weeks ago
- Learning graph-based code representations for source-level functional similarity detection. ICSE'23 ☆53 Updated 2 years ago
- Refactory: Re-factoring based Program Repair applied to Programming Assignments ☆40 Updated 3 years ago
- open science repo of "Neural Transfer Learning for Repairing Security Vulnerabilities in C Code" https://arxiv.org/pdf/2104.08308 ☆63 Updated last year
- methods2test is a supervised dataset consisting of Test Cases and their corresponding Focal Methods from a set of Java software repositor… ☆163 Updated last year
- NaturalCC: An Open-Source Toolkit for Code Intelligence ☆304 Updated this week
- This repository is the replication package of the ICSE22 paper "FIRA: Fine-Grained Graph-Based Code Change Representation for Automated C… ☆32 Updated 3 years ago
- ☆61 Updated last year
- Repository for PrimeVul Vulnerability Detection Dataset ☆178 Updated last year
- ☆39 Updated 2 years ago
- Large Language Models for Software Engineering ☆245 Updated last month
- ☆34 Updated last year
- For our ISSTA23 paper "How Effective are Neural Networks for Fixing Security Vulnerabilities?" by Yi Wu, Nan Jiang, Hung Viet Pham, Thiba… ☆39 Updated last year
- ☆49 Updated 2 years ago
- For our ICSE23 paper "Impact of Code Language Models on Automated Program Repair" by Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan ☆63 Updated 11 months ago
- ☆14 Updated 2 years ago
- ☆18 Updated last year
- Replication Package for "Compressing Pre-trained Models of Code into 3 MB", ASE 2022 ☆30 Updated 11 months ago