sotorrent / db-scripts
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent dataset, to retrieve Stack Overflow references from the BigQuery GitHub dataset, and to retrieve data from the SOTorrent dataset for analysis.
☆15Updated 3 months ago
Alternatives and similar repositories for db-scripts
Users who are interested in db-scripts are comparing it to the repositories listed below.
- Tree-based Autofolding Software Summarization Algorithm☆43Updated 9 years ago
- ☆50Updated 5 years ago
- ☆14Updated last year
- C# Data Extraction for "Learning to Represent Edits"☆27Updated 7 years ago
- Babelfish Python client☆17Updated 6 years ago
- A toolkit for pre-processing large source code corpora☆45Updated 3 years ago
- The code for three models introduced in "Dynamic Neural Program Embeddings for Program Repair" (ICLR 18)☆33Updated 7 years ago
- An implementation of "code2vec: Learning Distributed Representations of Code"☆30Updated last year
- code2vec: Learning Distributed Representations of Code☆14Updated 7 years ago
- A Systematic Literature Review of Deep Learning in Software Engineering☆20Updated last year
- mwcvitkovic / Open-Vocabulary-Learning-on-Source-Code-with-a-Graph-Structured-Cache--Code-Preprocessor: Library for preprocessing Java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str…☆21Updated 7 years ago
- MLonCode community effort to implement Learning Distributed Representations of Code (https://arxiv.org/pdf/1803.09473.pdf)☆39Updated 7 years ago
- Bilateral Neural Network implementation in Tensorflow☆51Updated 6 years ago
- Website for Learning from "Big Code"☆30Updated 4 years ago
- Artifacts and other data for "Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces"☆22Updated 5 years ago
- 🤓 user2code2vec: Embeddings for Profiling Students Based on Distributional Representations of Source Code. Full Paper presented at Learn…☆22Updated 6 years ago
- An empirical study on patch correctness☆15Updated 3 years ago
- Probabilistic API Mining☆53Updated 7 years ago
- The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…☆22Updated 5 years ago
- ☆13Updated 2 years ago
- Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlie…☆84Updated 2 years ago
- DeepBugs is a framework for learning bug detectors from an existing code corpus.☆152Updated 4 years ago
- A set of tools to help work with "Big Code"☆42Updated 3 years ago
- A set of tools for extracting tokens and ASTs from code☆22Updated 7 years ago
- Convert source code into numerical tokens☆65Updated 2 years ago
- Dataset and code corresponding to Associating Natural Language Comment and Source Code Entities (AAAI 2020)☆20Updated 5 years ago
- DiffSearch is a search engine for code changes. The input is a query that describes a code change and the output is a list of matching co…☆19Updated last year
- Mining tool and large-scale datasets of single statement bug fixes in Python☆18Updated 2 years ago
- AST factorization: transformation of the AST of Kotlin source code into a vector☆11Updated 6 years ago
- Code for "Typilus: Neural Type Hints" PLDI 2020☆62Updated 2 years ago