dspinellis / tokenizer
Convert source code into numerical tokens
☆65 Updated last year
Alternatives and similar repositories for tokenizer:
Users who are interested in tokenizer are comparing it to the libraries listed below.
- Sequence-to-Sequence Learning for End-to-End Program Repair (IEEE TSE 2019). Open-science repo. http://arxiv.org/pdf/1901.01808 ☆84 Updated last year
- A toolkit for pre-processing large source code corpora ☆47 Updated 2 years ago
- Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlie… ☆83 Updated last year
- A tool to convert nodes in an Abstract Syntax Tree into vector embeddings ☆74 Updated 2 years ago
- Pyc-cfg is a pure Python control flow graph builder for almost all of the ANSI C programming language. ☆51 Updated 7 years ago
- MLonCode community effort to implement Learning Distributed Representations of Code (https://arxiv.org/pdf/1803.09473.pdf) ☆39 Updated 6 years ago
- srcML Toolkit ☆129 Updated this week
- Set of tools to help working with "Big Code" ☆43 Updated 2 years ago
- ☆66 Updated 2 years ago
- Reproduce the results of Tree-based Convolutional Neural Network (TBCNN) ☆39 Updated last year
- Repository for Deep API Learning (DeepAPI) ☆54 Updated 3 years ago
- ☆112 Updated 2 years ago
- ☆49 Updated 2 years ago
- ☆25 Updated 3 years ago
- An implementation of "code2vec: Learning Distributed Representations of Code" ☆30 Updated 8 months ago
- Neural Code Translator provides instructions, datasets, and a deep learning infrastructure (based on seq2seq) that aims at learning code … ☆38 Updated 5 years ago
- PyTorch implementation of the code2seq model. ☆61 Updated 7 months ago
- 58069 Java source code diffs. http://arxiv.org/pdf/1807.03200 ☆93 Updated 5 years ago
- Description: We want to create a deep Neural Network that can automatically generate comments for code snippets passed to it. The motiva… ☆44 Updated 2 years ago
- AutoenCODE is a Deep Learning infrastructure that allows to encode source code fragments into vector representations, which can be used t… ☆61 Updated 6 years ago
- ☆31 Updated 6 years ago
- ☆56 Updated last year
- AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations ☆26 Updated 3 years ago
- Demonstration of the path-extraction process shown in the paper "A General Path-Based Representation for Predicting Program Properties" ☆24 Updated 3 years ago
- A set of tools for extracting tokens and ASTs from code ☆22 Updated 6 years ago
- The code for three models introduced in "Dynamic Neural Program Embeddings for Program Repair" (ICLR 2018) ☆32 Updated 6 years ago
- src2abs is a tool that abstracts Java source code ☆35 Updated 5 years ago
- Probabilistic API Mining ☆53 Updated 7 years ago
- Defects4J Dissection presents data to help researchers and practitioners to better understand the Defects4J bug dataset ☆61 Updated last year
- ☆25 Updated 4 years ago