CUHK-ARISE / ml4code-dataset
A collection of datasets for machine learning for big code
☆56 Updated 3 years ago
Alternatives and similar repositories for ml4code-dataset
Users interested in ml4code-dataset also compare it with the repositories listed below.
- This repository is to support contributions for tools and new data entries for the D2A dataset hosted in DAX ☆71 Updated 2 years ago
- ☆41 Updated 2 years ago
- Replication package for "Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection", ICSE 2024. ☆63 Updated 7 months ago
- ☆49 Updated 2 years ago
- Learning graph-based code representations for source-level functional similarity detection. ICSE'23 ☆49 Updated 2 years ago
- ☆112 Updated 2 years ago
- [ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees ☆90 Updated 3 years ago
- ☆33 Updated last year
- ☠️ Ground-truth dataset for vulnerability prediction (known research datasets and data sources included such as NVD, CVE Details and OSV)… ☆93 Updated last year
- open science repo of "Neural Transfer Learning for Repairing Security Vulnerabilities in C Code" https://arxiv.org/pdf/2104.08308 ☆63 Updated last year
- Code and data for paper "Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree". ☆64 Updated 3 years ago
- ☆29 Updated 3 years ago
- ☆14 Updated 3 years ago
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆85 Updated 2 months ago
- SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning ☆17 Updated 3 years ago
- An Extensible Java Bug Benchmark for Automatic Program Repair Studies ☆34 Updated last year
- ☆205 Updated 9 months ago
- Statement-level deep learning model for automated software vulnerability detection in C/C++ (Accepted in MSR 2022) ☆72 Updated 2 years ago
- ☆45 Updated 2 years ago
- MegaVul - The largest, high-quality, extensible, continuously updated, C/C++/Java vulnerability dataset ☆73 Updated 4 months ago
- Code of our paper Applying CodeBERT for Automated Program Repair of Java Simple Bugs which is accepted to MSR 2021. ☆52 Updated 2 years ago
- For our ISSTA20 paper "CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair" by Thibaud Lutellier… ☆62 Updated 2 years ago
- ☆61 Updated last year
- Bugs.jar: A Large-scale, Diverse Dataset of Bugs for Java Program Repair ☆56 Updated 7 years ago
- For our ISSTA23 paper "How Effective are Neural Networks for Fixing Security Vulnerabilities?" by Yi Wu, Nan Jiang, Hung Viet Pham, Thiba… ☆36 Updated last year
- Code and dataset for paper C4: Contrastive Cross-Language Code Clone Detection ☆30 Updated 2 years ago
- Replication Package for "Compressing Pre-trained Models of Code into 3 MB", ASE 2022 ☆29 Updated 7 months ago
- For our ICSE23 paper "Impact of Code Language Models on Automated Program Repair" by Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan ☆60 Updated 7 months ago
- ☆83 Updated 4 years ago
- This repo illustrates how to evaluate the artifacts in the paper Deep Just-in-Time Defect Prediction: How Far Are We? published in ISSTA'… ☆37 Updated last year