sotorrent / db-scripts
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent dataset, to retrieve Stack Overflow references from the BigQuery GitHub dataset, and to retrieve data from the SOTorrent dataset for analysis.
☆14Updated last month
Related projects
Alternatives and complementary repositories for db-scripts
- A Systematic Literature Review of Deep Learning in Software Engineering☆18Updated 2 months ago
- A benchmark for evaluating embeddings of identifiers in source code.☆22Updated 3 years ago
- C# Data Extraction for "Learning to Represent Edits"☆27Updated 6 years ago
- Restoring Execution Environments of Jupyter Notebooks☆21Updated last year
- Characterizing the natural language descriptions in software logging statements [ASE'18]☆16Updated 5 years ago
- ICSE 2021 Artifact for: Shipwright: A Human-in-the-Loop System for Dockerfile Repair.☆22Updated 3 years ago
- A tool for mining graph-based change patterns in Python code☆19Updated 5 months ago
- MLonCode community effort to implement Learning Distributed Representations of Code (https://arxiv.org/pdf/1803.09473.pdf)☆40Updated 6 years ago
- ICSE'18: Tuning Smote☆11Updated 6 years ago
- Course material for Algorithms and Data Structures (TU Delft TI3110TU)☆10Updated 6 years ago
- Code and data for the paper "BASHEXPLAINER: Retrieval-Augmented Bash Code Comment Generation based on Fine-tuned CodeBERT", which was accepted in…☆11Updated 2 years ago
- Babelfish Python client☆16Updated 5 years ago
- A toolkit for pre-processing large source code corpora☆46Updated 2 years ago
- Smelling smells using Deep Learning☆44Updated 3 years ago
- an implementation of "code2vec: Learning Distributed Representations of Code"☆29Updated 4 months ago
- The dataset for the variable-misuse task, used in the ICLR 2020 paper 'Global Relational Models of Source Code' [https://openreview.net/f…☆22Updated 4 years ago
- Source code and data for our large-scale study of Java annotation in practice☆12Updated last year
- AST factorization: transformation AST of Kotlin source code to a vector☆11Updated 5 years ago
- 🤓 user2code2vec: Embeddings for Profiling Students Based on Distributional Representations of Source Code. Full Paper presented at Learn…☆22Updated 5 years ago
- Tree-based Autofolding Software Summarization Algorithm☆42Updated 8 years ago
- Probabilistic Itemset Mining☆19Updated 8 years ago
- Code for "Typilus: Neural Type Hints" PLDI 2020☆59Updated last year
- Code Snippet Recommendation from Stack Overflow Post☆17Updated 3 years ago
- Code for "CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning" (WWW 2019)☆36Updated 4 years ago
- Language-independent, search-based program repair -- just your cup of tea! ☕☆28Updated 4 months ago
- Code for the paper "A Structural Model for Contextual Code Changes"☆27Updated last year
- Repository of the paper 'CodeQueries: A Dataset of Semantic Queries over Code' published in ISEC 2024☆12Updated 6 months ago