CUHK-ARISE / ml4code-dataset
A collection of datasets for machine learning for big code
☆62 · Updated 4 years ago
Alternatives and similar repositories for ml4code-dataset
Users who are interested in ml4code-dataset compare it to the repositories listed below.
- This repository is to support contributions for tools and new data entries for the D2A dataset hosted in DAX ☆74 · Updated 3 years ago
- VulRepair: A T5-Based Automated Software Vulnerability Repair ☆83 · Updated 7 months ago
- ☆34 · Updated last year
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆114 · Updated 3 months ago
- ☆122 · Updated 3 years ago
- [TOSEM 2023] A Survey of Learning-based Automated Program Repair ☆73 · Updated last year
- Code of our paper Applying CodeBERT for Automated Program Repair of Java Simple Bugs which is accepted to MSR 2021. ☆52 · Updated 3 years ago
- open science repo of "Neural Transfer Learning for Repairing Security Vulnerabilities in C Code" https://arxiv.org/pdf/2104.08308 ☆62 · Updated last year
- For our ISSTA23 paper "How Effective are Neural Networks for Fixing Security Vulnerabilities?" by Yi Wu, Nan Jiang, Hung Viet Pham, Thiba… ☆41 · Updated 2 years ago
- [ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees ☆89 · Updated 4 months ago
- Replication package for "Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection", ICSE 2024. ☆68 · Updated last year
- BugsInPy: Benchmarking Bugs in Python Projects ☆121 · Updated last week
- Repository for PrimeVul Vulnerability Detection Dataset ☆206 · Updated last year
- A multi-lingual program repair benchmark set based on the Quixey Challenge ☆131 · Updated 3 years ago
- ☆40 · Updated 2 years ago
- NaturalCC: An Open-Source Toolkit for Code Intelligence ☆311 · Updated 3 months ago
- Statement-level deep learning model for automated software vulnerability detection in C/C++ (Accepted in MSR 2022) ☆74 · Updated 3 years ago
- For our ICSE23 paper "Impact of Code Language Models on Automated Program Repair" by Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan ☆62 · Updated last year
- ☠️ Ground-truth dataset for vulnerability prediction (known research datasets and data sources included such as NVD, CVE Details and OSV)… ☆101 · Updated 2 years ago
- Large Language Models for Software Engineering ☆256 · Updated 5 months ago
- TeCo: an ML+Execution model for test completion ☆32 · Updated last year
- ☆32 · Updated 3 years ago
- A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries ☆350 · Updated 4 years ago
- ☆16 · Updated 2 years ago
- Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks ☆252 · Updated last year
- ☆20 · Updated last year
- For our ICSE21 paper "CURE: Code-Aware Neural Machine Translation for Automatic Program Repair" by Nan Jiang, Thibaud Lutellier, and Lin … ☆56 · Updated 3 years ago
- Code and dataset for paper C4: Contrastive Cross-Language Code Clone Detection ☆31 · Updated 3 years ago
- ☆30 · Updated 11 months ago
- ☆26 · Updated 3 years ago