microsoft / CodeBERT
CodeBERT
☆2,596Updated 2 years ago
Alternatives and similar repositories for CodeBERT
Users who are interested in CodeBERT are comparing it to the libraries listed below.
- CodeXGLUE☆1,722Updated last year
- Home of CodeT5: Open Code LLMs for Code Understanding and Generation☆3,039Updated last year
- Code for the paper "Evaluating Large Language Models Trained on Code"☆2,871Updated 6 months ago
- Datasets, tools, and benchmarks for representation learning of code.☆2,353Updated 3 years ago
- Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from d…☆760Updated last year
- TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"☆1,138Updated last year
- Code for the model presented in the paper: "code2seq: Generating Sequences from Structured Representations of Code"☆561Updated last month
- A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering☆714Updated last year
- NaturalCC: An Open-Source Toolkit for Code Intelligence☆304Updated last week
- ☆235Updated last year
- Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].☆187Updated 3 years ago
- Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024☆1,543Updated last month
- APPS: Automated Programming Progress Standard (NeurIPS 2021)☆479Updated last year
- 👨💻 An awesome and curated list of best code-LLM for research.☆1,225Updated 8 months ago
- Implementation of the paper "Language-agnostic representation learning of source code from structure and context".☆169Updated 3 years ago
- methods2test is a supervised dataset consisting of Test Cases and their corresponding Focal Methods from a set of Java software repositor…☆160Updated last year
- [TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.☆2,786Updated 3 weeks ago
- ☆24Updated 3 years ago
- GPTCloneBench is a clone detection benchmark based on SemanticCloneBench and GPT.☆12Updated 6 months ago
- This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur…☆541Updated 6 months ago
- A Database of Real Faults and an Experimental Infrastructure to Enable Controlled Experiments in Software Engineering Research☆857Updated 2 months ago
- CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.☆5,119Updated 6 months ago
- A multi-lingual program repair benchmark set based on the Quixey Challenge☆125Updated 2 years ago
- Generative model for code infilling and synthesis☆304Updated last year
- ☆25Updated last year
- A library for mining of path-based representations of code (and more)☆293Updated last year
- A framework for the evaluation of autoregressive code generation language models.☆973Updated 3 weeks ago
- Large Language Models for Software Engineering☆241Updated 2 weeks ago
- Predicting buggy source files from bug reports.☆28Updated 2 years ago
- This repository is to support contributions for tools for the Project CodeNet dataset hosted in DAX☆1,609Updated 2 weeks ago