sotorrent / db-scripts
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent dataset, to retrieve Stack Overflow references from the BigQuery GitHub dataset, and to retrieve data from the SOTorrent dataset for analysis.
☆15Updated last month
Alternatives and similar repositories for db-scripts
Users who are interested in db-scripts are comparing it to the repositories listed below:
- A toolkit for pre-processing large source code corpora☆47Updated 3 years ago
- Tree-based Autofolding Software Summarization Algorithm☆43Updated 9 years ago
- Artifacts and other data for "Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces"☆22Updated 5 years ago
- ☆14Updated last year
- ☆50Updated 5 years ago
- Probabilistic API Mining☆53Updated 7 years ago
- The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…☆22Updated 5 years ago
- A Systematic Literature Review of Deep Learning in Software Engineering☆19Updated last year
- The code for the three models introduced in "Dynamic Neural Program Embeddings for Program Repair" (ICLR 2018)☆32Updated 7 years ago
- MLonCode community effort to implement Learning Distributed Representations of Code (https://arxiv.org/pdf/1803.09473.pdf)☆39Updated 7 years ago
- C# Data Extraction for "Learning to Represent Edits"☆26Updated 7 years ago
- Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlie…☆84Updated 2 years ago
- Convert source code into numerical tokens☆65Updated 2 years ago
- code2vec: Learning Distributed Representations of Code☆14Updated 7 years ago
- DiffSearch is a search engine for code changes. The input is a query that describes a code change and the output is a list of matching co…☆19Updated last year
- Evaluation of source authorship attribution tool☆23Updated 4 years ago
- BEE (Bug rEport analyzEr), a tool for structuring and analyzing bug reports☆26Updated last year
- Bilateral Neural Network implementation in Tensorflow☆51Updated 6 years ago
- A set of tools for extracting tokens and ASTs from code☆22Updated 7 years ago
- ☆24Updated 4 years ago
- mwcvitkovic / Open-Vocabulary-Learning-on-Source-Code-with-a-Graph-Structured-Cache--Code-Preprocessor: Library for preprocessing Java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str…☆21Updated 7 years ago
- ☆13Updated 2 years ago
- Dataset and code corresponding to Associating Natural Language Comment and Source Code Entities (AAAI 2020)☆20Updated 5 years ago
- ☆24Updated 3 years ago
- ☆23Updated 2 years ago
- NL2Type: Inferring JavaScript Function Types from Natural Language Information☆23Updated 6 years ago
- Hosts our tool for mining simple "stupid" bugs (SStuBs).☆37Updated 3 years ago
- A benchmark for evaluating embeddings of identifiers in source code.☆22Updated 4 years ago
- An implementation of "code2vec: Learning Distributed Representations of Code"☆30Updated last year
- Re-implementation of "code2seq: Generating Sequences from Structured Representations of Code"☆45Updated last year