cedricrupb / code_tokenize

Fast tokenization and structural analysis of any programming language

☆58

Alternatives and similar repositories for code_tokenize

Users who are interested in code_tokenize are comparing it to the libraries listed below.


eth-sri / TFix
☆67 Updated 3 years ago
cedricrupb / TSSB3M
Mining tool and large-scale datasets of single statement bug fixes in Python
☆17 Updated last year
giganticode / codeprep
A toolkit for pre-processing large source code corpora
☆47 Updated 2 years ago
cedricrupb / code_diff
Fast AST based code differencing in Python
☆34 Updated 6 months ago
amazon-science / recode
Releasing code for "ReCode: Robustness Evaluation of Code Generation Models"
☆52 Updated last year
cedricrupb / code_ast
Fast and robust AST parsing of any language
☆49 Updated 6 months ago
google-research / plur
PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. W…
☆87 Updated 3 years ago
giganticode / probes
Probing pre-trained source code models
☆15 Updated 3 years ago
tech-srl / c3po
Code for the paper "A Structural Model for Contextual Code Changes"
☆32 Updated last year
google-research-datasets / great
The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…
☆22 Updated 4 years ago
VHellendoorn / ICLR20-Great
Data and Code for Reproducing "Global Relational Models of Source Code"
☆84 Updated 4 years ago
bayesgroup / code_transformers
Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Sourc…
☆63 Updated 3 years ago
terryyz / DataAug4Code
Source Code Data Augmentation for Deep Learning: A Survey.
☆67 Updated last year
shrivastavadisha / repo_level_prompt_generation
☆124 Updated 2 years ago
dpfried / incoder
Generative model for code infilling and synthesis
☆304 Updated last year
bdqnghi / awesome-ai4code
A collection of recent papers, benchmarks and datasets of AI4Code domain.
☆58 Updated last year
utopia-group / TypeT5
Seq2seq Type Inference using Static Analysis and CodeT5
☆31 Updated 2 years ago
saltudelft / type4py
Type4Py: Deep Similarity Learning-Based Type Inference for Python
☆65 Updated last year
EngineeringSoftware / CoditT5
CoditT5: Pretraining for Source Code and Natural Language Editing
☆28 Updated 6 months ago
microsoft / neurips21-self-supervised-bug-detection-and-repair
Replication Code for "Self-Supervised Bug Detection and Repair" NeurIPS 2021
☆111 Updated 2 years ago
danhper / bigcode-tools
Set of tools to help working with "Big Code"
☆43 Updated 3 years ago
wasiahmad / AVATAR
Official code of our work, AVATAR: A Parallel Corpus for Java-Python Program Translation.
☆55 Updated last year
danielzuegner / code-transformer
Implementation of the paper "Language-agnostic representation learning of source code from structure and context".
☆169 Updated 3 years ago
eth-sri / learning-real-bug-detector
☆16 Updated 10 months ago
parasj / contracode
Contrastive Code Representation Learning: functionality-based JavaScript embeddings through self-supervised learning
☆168 Updated 3 years ago
mahimanzum / FixEval
We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e…
☆23 Updated 2 years ago
csebuetnlp / CoDesc
A large dataset of 4.2m Java source code and parallel data of their description from code search, and code summarization studies.
☆53 Updated 3 years ago
rizwan09 / REDCODER
☆45 Updated last month
saikat107 / Codit
☆13 Updated 2 years ago
tech-srl / slm-code-generation
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)
☆89 Updated 3 years ago