Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code
☆66 · Dec 3, 2021 · Updated 4 years ago
Alternatives and similar repositories for code_transformers
Users who are interested in code_transformers are comparing it to the libraries listed below:
- IST'21 & SANER'22: Semantic-Preserving Program Transformations ☆31 · Oct 25, 2022 · Updated 3 years ago
- ☆13 · Jul 6, 2023 · Updated 2 years ago
- ESEC/FSE'21: Prediction-Preserving Program Simplification ☆10 · Oct 4, 2022 · Updated 3 years ago
- This is the code repository for our ICPC 2021 paper "Improving Code Summarization with Block-wise Abstract Syntax Tree Splitting" ☆24 · Jan 3, 2023 · Updated 3 years ago
- Library for preprocessing java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str… ☆21 · Oct 22, 2018 · Updated 7 years ago
- The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f… ☆22 · Aug 19, 2020 · Updated 5 years ago
- A large dataset of 4.2m Java source code and parallel data of their description from code search, and code summarization studies. ☆15 · Feb 24, 2022 · Updated 4 years ago
- An IntelliJ-based IDE plugin for Python AST transformations ☆18 · Aug 16, 2023 · Updated 2 years ago
- Replication package for EMNLP 2021 paper: CAST: Enhancing Code Summarization with Hierarchical Splitting and Reconstruction of Abstract S… ☆34 · Feb 16, 2022 · Updated 4 years ago
- PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. W… ☆87 · Apr 5, 2022 · Updated 3 years ago
- Code for ICML 2021 paper: How could Neural Networks understand Programs? ☆123 · Nov 7, 2024 · Updated last year
- 🔍 Code Search Tools & Experiments ☆12 · Updated this week
- This repo will contain replication package for the paper "Feeding Trees to Transformers for Code Completion" ☆99 · Jun 3, 2022 · Updated 3 years ago
- ☆30 · Nov 23, 2020 · Updated 5 years ago
- Implementation of the paper "Language-agnostic representation learning of source code from structure and context". ☆172 · Apr 6, 2022 · Updated 3 years ago
- Deep Just-In-Time Inconsistency Detection Between Comments and Source Code: Artifact ☆22 · Jul 21, 2025 · Updated 7 months ago
- [ICSE 2021] - InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees ☆89 · Aug 8, 2025 · Updated 6 months ago
- ☆23 · Aug 6, 2020 · Updated 5 years ago
- Code for the ICPC 2020 paper Improved Source Code Summarization via a Graph Neural Network ☆68 · Apr 9, 2021 · Updated 4 years ago
- Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021]. ☆186 · Mar 1, 2022 · Updated 3 years ago
- A Tree-Based Transformer Architecture for Code Generation. (AAAI'20) ☆91 · May 31, 2022 · Updated 3 years ago
- Official repository for the paper "GN-Transformer: Fusing AST and Source Code information in Graph Networks". ☆17 · May 25, 2025 · Updated 9 months ago
- Contrastive Code Representation Learning: functionality-based JavaScript embeddings through self-supervised learning ☆169 · Dec 26, 2021 · Updated 4 years ago
- ☆61 · Dec 21, 2023 · Updated 2 years ago
- Data and Code for Reproducing "Global Relational Models of Source Code" ☆85 · May 10, 2021 · Updated 4 years ago
- Sequence-to-Sequence Learning for End-to-End Program Repair (IEEE TSE 2019). Open-science repo. http://arxiv.org/pdf/1901.01808 ☆86 · Jun 9, 2023 · Updated 2 years ago
- NaturalCC: An Open-Source Toolkit for Code Intelligence ☆313 · Feb 6, 2026 · Updated 3 weeks ago
- Set of tools to help working with "Big Code" ☆42 · Apr 28, 2022 · Updated 3 years ago
- A toolkit for pre-processing large source code corpora ☆45 · Sep 30, 2022 · Updated 3 years ago
- [AAAI 2021] - TreeCaps: Tree-based Capsule Network for Source Code Processing ☆23 · Mar 24, 2023 · Updated 2 years ago
- Mining tool and large-scale datasets of single statement bug fixes in Python ☆19 · Nov 29, 2023 · Updated 2 years ago
- Official implementation of our work, A Transformer-based Approach for Source Code Summarization [ACL 2020]. ☆195 · May 28, 2022 · Updated 3 years ago
- ☆50 · Feb 12, 2020 · Updated 6 years ago
- A redistributable subset of the ETH Py150 corpus [https://www.sri.inf.ethz.ch/py150], introduced in the ICML 2020 paper 'Learning and Eva… ☆32 · Aug 11, 2020 · Updated 5 years ago
- Utilities used by the Deep Program Understanding team ☆104 · Jun 12, 2023 · Updated 2 years ago
- Towards converting multilingual source code into one language-agnostic graph representation. ☆48 · Mar 22, 2023 · Updated 2 years ago
- Code for the paper "Embedding Java Classes with code2vec: Improvements from Variable Obfuscation" in MSR 2020 ☆32 · Mar 24, 2023 · Updated 2 years ago
- Code for "StructCoder: Structure-Aware Transformer for Code Generation" ☆79 · Jan 21, 2024 · Updated 2 years ago
- Web queries dataset for code search ☆32 · Jun 3, 2023 · Updated 2 years ago