google-research/plur

PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. We provide scripts for downloading, processing, and loading the datasets through a unified API and data structures shared by all datasets.

☆90

Alternatives and similar repositories for plur

Users that are interested in plur are comparing it to the libraries listed below.

google-research-datasets / great
The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…
☆22 · Aug 19, 2020 · Updated 5 years ago
cedricrupb / TSSB3M
Mining tool and large-scale datasets of single statement bug fixes in Python
☆19 · Nov 29, 2023 · Updated 2 years ago
VHellendoorn / ICLR20-Great
Data and Code for Reproducing "Global Relational Models of Source Code"
☆87 · May 10, 2021 · Updated 5 years ago
google-research / runtime-error-prediction
This is the repository for the paper Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descripti…
☆25 · Nov 18, 2022 · Updated 3 years ago
parasj / contracode
Contrastive Code Representation Learning: functionality-based JavaScript embeddings through self-supervised learning
☆172 · Dec 26, 2021 · Updated 4 years ago
saltudelft / CD4Py
CD4Py: Code De-Duplication for Python
☆23 · Dec 13, 2020 · Updated 5 years ago
bayesgroup / code_transformers
Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Sourc…
☆67 · Dec 3, 2021 · Updated 4 years ago
wasiahmad / PLBART
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
☆186 · Mar 1, 2022 · Updated 4 years ago
CarperAI / CodeReviewSE
Stuff related to scraping the Code Review StackExchange
☆11 · Jan 19, 2023 · Updated 3 years ago
mlapistudy / ICSE2021_421
This is the artifact for paper “Are Machine Learning Cloud APIs Used Correctly? (#421)” in ICSE2021
☆16 · Feb 27, 2021 · Updated 5 years ago
modit-team / MODIT
MODIT: On Multi-Modal Learning of Editing Source Code.
☆20 · Apr 24, 2021 · Updated 5 years ago
microsoft / neurips21-self-supervised-bug-detection-and-repair
Replication Code for "Self-Supervised Bug Detection and Repair" NeurIPS 2021
☆111 · Aug 30, 2022 · Updated 3 years ago
danielzuegner / code-transformer
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".
☆174 · Apr 6, 2022 · Updated 4 years ago
ml4code / ml4code.github.io
Website for "A Survey of Machine Learning for Big Code and Naturalness"
☆295 · Feb 7, 2025 · Updated last year
acr31 / features-javac
A javac plugin for extracting a feature graph for plugging in to machine learning models
☆28 · Jan 20, 2021 · Updated 5 years ago
google-research-datasets / eth_py150_open
A redistributable subset of the ETH Py150 corpus [https://www.sri.inf.ethz.ch/py150], introduced in the ICML 2020 paper 'Learning and Eva…
☆32 · Aug 11, 2020 · Updated 5 years ago
microsoft / dpu-utils
Utilities used by the Deep Program Understanding team
☆104 · Jun 12, 2023 · Updated 3 years ago
dpfried / incoder
Generative model for code infilling and synthesis
☆312 · Sep 9, 2023 · Updated 2 years ago
zkcpku / HGT-HPG
code for "Learning to Represent Programs with Heterogeneous Graphs"
☆12 · May 17, 2022 · Updated 4 years ago
neulab / incremental_tree_edit
Code for "Learning Structural Edits via Incremental Tree Transformations" (ICLR'21)
☆41 · Jun 20, 2021 · Updated 5 years ago
shrivastavadisha / N-PEPS
☆18 · Jan 3, 2022 · Updated 4 years ago
pdlan / OSCAR
Code for ICML 2021 paper: How could Neural Networks understand Programs?
☆123 · Nov 7, 2024 · Updated last year
jjhenkel / code-vectors-artifact
Artifacts and other data for "Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces"
☆22 · Jun 5, 2020 · Updated 6 years ago
breandan / cstk
🔍 Code Search Tools & Experiments
☆13 · Jun 4, 2026 · Updated last month
CGCL-codes / naturalcc
NaturalCC: An Open-Source Toolkit for Code Intelligence
☆317 · Updated this week
jordiae / neural-compilers
☆21 · Jul 11, 2022 · Updated 4 years ago
agemagician / CodeTrans
Pretrained Language Models for Source code
☆258 · Jun 1, 2021 · Updated 5 years ago
microsoft / ptgnn
A PyTorch Graph Neural Network Library
☆375 · Sep 4, 2025 · Updated 10 months ago
TruX-DTF / FL-VS-APR
☆16 · Apr 26, 2021 · Updated 5 years ago
ASSERT-KTH / sequencer
Sequence-to-Sequence Learning for End-to-End Program Repair (IEEE TSE 2019). Open-science repo. http://arxiv.org/pdf/1901.01808
☆87 · Jun 9, 2023 · Updated 3 years ago
facebookresearch / CodeGen
Reference implementation of code generation projects from Facebook AI Research. General toolkit to apply machine learning to code, from d…
☆776 · Mar 12, 2026 · Updated 4 months ago
TranSMS / M2TS
☆24 · Oct 15, 2023 · Updated 2 years ago
BASE-LAB-SJTU / CosBench
A dataset for natural language code search.
☆14 · Feb 13, 2020 · Updated 6 years ago
typilus / typilus-action
A GitHub Action for suggesting Python type annotations.
☆42 · Mar 23, 2023 · Updated 3 years ago
google-research / python-graphs
A static analysis library for computing graph representations of Python programs suitable for use with graph neural networks.
☆343 · Aug 11, 2023 · Updated 2 years ago
rizwan09 / REDCODER
☆44 · Jun 24, 2025 · Updated last year
facebookresearch / mbr-exec
code for "Natural Language to Code Translation with Execution"
☆41 · Nov 2, 2022 · Updated 3 years ago
mwcvitkovic / Open-Vocabulary-Learning-on-Source-Code-with-a-Graph-Structured-Cache--Code-Preprocessor
Library for preprocessing java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str…
☆21 · Oct 22, 2018 · Updated 7 years ago
mdrafiqulrabin / tnpa-generalizability
IST'21 & SANER'22: Semantic-Preserving Program Transformations
☆31 · Oct 25, 2022 · Updated 3 years ago