sotorrent / db-scripts
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent data set, to retrieve Stack Overflow references from the BigQuery GitHub data set, and to retrieve data from the SOTorrent data set for analysis.
☆16Updated 5 months ago
Alternatives and similar repositories for db-scripts
Users who are interested in db-scripts are comparing it to the libraries listed below.
- A toolkit for pre-processing large source code corpora☆47Updated 2 years ago
- ☆14Updated last year
- mwcvitkovic / Open-Vocabulary-Learning-on-Source-Code-with-a-Graph-Structured-Cache--Code-Preprocessor: Library for preprocessing Java source code into Augmented ASTs, as per the paper Open Vocabulary Learning on Source Code with a Graph-Str…☆21Updated 6 years ago
- ☆50Updated 5 years ago
- Code for the three models introduced in "Dynamic Neural Program Embeddings for Program Repair" (ICLR 18)☆32Updated 7 years ago
- Explainable AI for Software Engineering: A Hands-on Guide on How to Make Software Analytics More Practical, Explainable, and Actionable (…☆26Updated 3 years ago
- Evaluation of source authorship attribution tool☆23Updated 4 years ago
- DeepBugs is a framework for learning bug detectors from an existing code corpus.☆151Updated 4 years ago
- Probabilistic API Mining☆53Updated 7 years ago
- ☆24Updated 4 years ago
- Tree-based Autofolding Software Summarization Algorithm☆43Updated 9 years ago
- Code for paper "Lancer: Your Code Tell Me What You Need"☆11Updated 3 years ago
- C# Data Extraction for "Learning to Represent Edits"☆26Updated 6 years ago
- A Systematic Literature Review of Deep Learning in Software Engineering☆19Updated last year
- Code for "Typilus: Neural Type Hints" PLDI 2020☆61Updated 2 years ago
- Tool for analyzing git log messages and diffs.☆22Updated 4 years ago
- Artifacts and other data for "Code Vectors: Understanding Programs Through Embedded Abstracted Symbolic Traces"☆22Updated 5 years ago
- makecfg is a tool for building a CFG (Control Flow Graph) from a binary.☆18Updated 3 years ago
- Bilateral Neural Network implementation in Tensorflow☆51Updated 6 years ago
- Dataset and code corresponding to Associating Natural Language Comment and Source Code Entities (AAAI 2020)☆20Updated 4 years ago
- The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…☆22Updated 5 years ago
- ☆49Updated 2 years ago
- A Pre-trained BERT on StackOverflow Corpus☆47Updated 4 years ago
- A set of tools for extracting tokens and ASTs from code☆22Updated 7 years ago
- A dynamic method for detecting faults in incremental and parallel builds.☆19Updated 3 years ago
- Convert source code into numerical tokens☆65Updated 2 years ago
- Git history navigation for dedicated methods, across all kinds of changes incl. complex refactorings.☆42Updated last year
- ☆13Updated 2 years ago
- code2vec: Learning Distributed Representations of Code☆14Updated 7 years ago
- A benchmark for evaluating embeddings of identifiers in source code.☆22Updated 4 years ago