A Pre-trained BERT on StackOverflow Corpus
☆47Feb 27, 2021Updated 5 years ago
Alternatives and similar repositories for BERTOverflow
Users who are interested in BERTOverflow are comparing it to the repositories listed below
- Source Code and Data for Software Domain NER☆147Dec 21, 2022Updated 3 years ago
- TDCleaner: A Tool for Detecting Obsolete TODO Comments in Software Repos☆12Dec 9, 2021Updated 4 years ago
- This repository will contain the data and code for the WNUT 2020 NER task☆52Dec 21, 2022Updated 3 years ago
- A collection of publications on code models that look beyond accuracy.☆13Jun 30, 2023Updated 2 years ago
- HashtagMaster: Segmentation tool for hashtags☆12Oct 27, 2020Updated 5 years ago
- The replication package of <Sentiment Analysis for Software Engineering: How Far Can Pre-trained Transformer Models Go?>. Accepted by IC…☆11Nov 29, 2023Updated 2 years ago
- This repository contains the code for applying One-Token Approximation to a pretrained language model using subword-level tokenization.☆11May 7, 2020Updated 5 years ago
- ☆15Nov 5, 2020Updated 5 years ago
- CrossSim (exploiting Cross Project Relationships for Computing Open Source Software Similarity) is an approach that allows us to represen…☆14May 20, 2022Updated 3 years ago
- Code Snippet Recommendation from Stack Overflow Post☆19Jun 30, 2021Updated 4 years ago
- ☆14Jun 23, 2020Updated 5 years ago
- A dataset for natural language code search.☆14Feb 13, 2020Updated 6 years ago
- CD4Py: Code De-Duplication for Python☆23Dec 13, 2020Updated 5 years ago
- 📒 Notes on papers I have read☆20Jan 1, 2022Updated 4 years ago
- Annotated corpus and code for "Extracting COVID-19 Events from Twitter".☆44May 19, 2022Updated 3 years ago
- FOCUS is a context-aware collaborative-filtering system that exploits cross relationships among OSS projects to suggest the inclusion of …☆21Jun 14, 2023Updated 2 years ago
- Evaluation of source authorship attribution tool☆23Jun 5, 2021Updated 4 years ago
- AVATAR: Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations☆26Apr 26, 2021Updated 4 years ago
- Benchmarks for Kaggle's Predict Closed Questions on Stack Overflow competition☆55Mar 19, 2016Updated 9 years ago
- CLCDSA: Cross Language Code Clone Detection using Syntactical Features and API Documentation☆22Jun 29, 2025Updated 8 months ago
- Content for Applied ML Workshop @ DataHack Summit 2019☆24Nov 16, 2019Updated 6 years ago
- keras-bert implementation of the DataFountain epidemic Q&A assistant☆27Jun 7, 2020Updated 5 years ago
- Stacked Denoising BERT for Noisy Text Classification (Neural Networks 2020)☆32Nov 28, 2022Updated 3 years ago
- Your library for dynamic language modeling☆66Oct 23, 2018Updated 7 years ago
- Template and steps to build your personal blog using Jekyll and Minimal Mistakes☆10Feb 24, 2020Updated 6 years ago
- ☆10Feb 2, 2021Updated 5 years ago
- Implementation of the PCA algorithm using the Gram-Schmidt modification of NIPALS☆10Jun 13, 2015Updated 10 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation☆14Jul 30, 2025Updated 7 months ago
- Responsive personal page for displaying and sharing interesting links with your audience☆27Dec 31, 2023Updated 2 years ago
- Implementation of End-to-End Query Term Weighting (TW-BERT)☆34Jun 29, 2025Updated 8 months ago
- Code for ACL '19 paper: Towards Improving Neural Named Entity Recognition with Gazetteers☆32Jul 2, 2021Updated 4 years ago
- Self-supervised NER prototype - updated version (69 entity types - 17 broad entity groups). Uses pretrained BERT models with no fine tuni…☆78Jul 16, 2022Updated 3 years ago
- ALBERT+BiLSTM+CRF implementation based on the lightweight ALBERT model☆93May 25, 2023Updated 2 years ago
- Open-science repo for our experimental results of automatic software repair on the Defects4J benchmark of Java bugs☆37Nov 12, 2025Updated 3 months ago
- Modified version of fairseq, including new implementations for criterions using reinforcement learning methods.☆11Aug 14, 2019Updated 6 years ago
- Import entities from another Wikibase instance (e.g. Wikidata)☆13May 21, 2023Updated 2 years ago
- The dataset contains Wikipedia comments which have been labeled by human raters for toxic behavior.☆11Jun 20, 2020Updated 5 years ago
- ☆10Oct 30, 2019Updated 6 years ago
- General purpose pre-commit hooks used by BestDoctor for Python projects.☆12Jan 18, 2022Updated 4 years ago