CUHK-ARISE / ml4code-dataset
A collection of datasets for machine learning for big code
☆54 Updated 3 years ago
Alternatives and similar repositories for ml4code-dataset:
Users interested in ml4code-dataset are comparing it to the repositories listed below.
- Learning graph-based code representations for source-level functional similarity detection. ICSE'23 ☆50 Updated 2 years ago
- ☆41 Updated 2 years ago
- This repository is to support contributions for tools and new data entries for the D2A dataset hosted in DAX ☆69 Updated 2 years ago
- Code and data for paper "Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree". ☆64 Updated 3 years ago
- Code of our paper Applying CodeBERT for Automated Program Repair of Java Simple Bugs which is accepted to MSR 2021. ☆52 Updated 2 years ago
- Replication package for "Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection", ICSE 2024. ☆60 Updated 6 months ago
- Code and dataset for paper C4: Contrastive Cross-Language Code Clone Detection ☆30 Updated 2 years ago
- ☆31 Updated last year
- For our ISSTA23 paper "How Effective are Neural Networks for Fixing Security Vulnerabilities?" by Yi Wu, Nan Jiang, Hung Viet Pham, Thiba… ☆35 Updated last year
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆79 Updated last month
- ☆33 Updated 2 years ago
- open science repo of "Neural Transfer Learning for Repairing Security Vulnerabilities in C Code" https://arxiv.org/pdf/2104.08308 ☆61 Updated last year
- VulRepair: A T5-Based Automated Software Vulnerability Repair ☆75 Updated 3 weeks ago
- [ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees ☆90 Updated 3 years ago
- Replication package of a paper "Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction" ☆22 Updated last year
- ☆49 Updated 2 years ago
- ☆112 Updated 2 years ago
- ☆17 Updated last year
- RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair http://arxiv.org/pdf/2312.15698 ☆32 Updated last week
- For our ISSTA20 paper "CoCoNuT: Combining Context-Aware Neural Translation Models using Ensemble for Program Repair" by Thibaud Lutellier… ☆61 Updated 2 years ago
- ☆14 Updated 3 years ago
- ☆30 Updated 3 years ago
- ☆25 Updated 2 years ago
- ☠️ Ground-truth dataset for vulnerability prediction (known research datasets and data sources included such as NVD, CVE Details and OSV)… ☆88 Updated last year
- ☆60 Updated last year
- Replication Package for "Compressing Pre-trained Models of Code into 3 MB", ASE 2022 ☆28 Updated 5 months ago
- Statement-level deep learning model for automated software vulnerability detection in C/C++ (Accepted in MSR 2022) ☆71 Updated 2 years ago
- MegaVul - The largest, high-quality, extensible, continuously updated, C/C++/Java vulnerability dataset ☆64 Updated 2 months ago
- Deep learning code semantic similarity ☆62 Updated 5 years ago
- ☆25 Updated 3 years ago