saltudelft / ml4se
A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering
☆686 Updated 4 months ago
Related projects
Alternatives and complementary repositories for ml4se
- Large Language Models for Software Engineering ☆191 Updated this week
- NaturalCC: An Open-Source Toolkit for Code Intelligence ☆275 Updated this week
- methods2test is a supervised dataset consisting of Test Cases and their corresponding Focal Methods from a set of Java software repositor… ☆134 Updated 11 months ago
- A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries ☆242 Updated 3 years ago
- Benchmark ClassEval for class-level code generation. ☆126 Updated 3 weeks ago
- Repository for PrimeVul Vulnerability Detection Dataset ☆76 Updated 2 months ago
- For our ICSE23 paper "Impact of Code Language Models on Automated Program Repair" by Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan ☆56 Updated last month
- A library for mining of path-based representations of code (and more) ☆282 Updated 11 months ago
- ☆173 Updated this week
- Repo-Level Code generation papers ☆94 Updated 5 months ago
- Website for "A Survey of Machine Learning for Big Code and Naturalness" ☆291 Updated 3 months ago
- [TOSEM 2023] A Survey of Learning-based Automated Program Repair ☆68 Updated 6 months ago
- Pip compatible CodeBLEU metric implementation available for linux/macos/win ☆64 Updated this week
- ☆229 Updated 9 months ago
- ☆117 Updated last year
- Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey". Keep updating. ☆321 Updated this week
- ☆186 Updated 3 months ago
- Code and data for XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence ☆66 Updated last year
- ☆56 Updated 11 months ago
- ☆96 Updated 4 months ago
- CodeBERTScore: an automatic metric for code generation, based on BERTScore ☆168 Updated 8 months ago
- A multi-lingual program repair benchmark set based on the Quixey Challenge ☆102 Updated 2 years ago
- Artifact repository for the paper "Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code", In P… ☆40 Updated 5 months ago
- Extract and combine multiple source code views using tree-sitter ☆108 Updated 4 months ago
- [ICSE 2024 Industry Challenge Track] Official implementation of "ReposVul: A Repository-Level High-Quality Vulnerability Dataset". ☆45 Updated last month
- Vul4J: A Dataset of Reproducible Java Vulnerabilities ☆67 Updated 2 months ago
- Implementation of the paper "Language-agnostic representation learning of source code from structure and context". ☆167 Updated 2 years ago
- Code and dataset for paper C4: Contrastive Cross-Language Code Clone Detection ☆25 Updated 2 years ago
- CodeXGLUE ☆1,560 Updated 6 months ago
- ☆23 Updated 6 months ago