mhagiwara / github-typo-corpus
GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
☆516 · Dec 11, 2019 · Updated 6 years ago
Alternatives and similar repositories for github-typo-corpus
Users who are interested in github-typo-corpus are comparing it to the repositories listed below
- xfspell — the Transformer Spell Checker ☆189 · Jun 18, 2020 · Updated 5 years ago
- NanigoNet — Language detector for code-mixed input supporting 150+19 human+programming languages using deep neural networks ☆71 · May 22, 2023 · Updated 2 years ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆706 · Jul 31, 2023 · Updated 2 years ago
- Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagg… ☆949 · May 21, 2024 · Updated last year
- Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task ☆92 · Sep 19, 2019 · Updated 6 years ago
- Misspelling Oblivious Word Embeddings ☆201 · Aug 6, 2019 · Updated 6 years ago
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆458 · Mar 26, 2024 · Updated last year
- GMEG ☆31 · Nov 21, 2024 · Updated last year
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 · Feb 22, 2022 · Updated 3 years ago
- A Visual Analysis Tool to Explore Learned Representations in Transformers Models ☆603 · Feb 7, 2024 · Updated 2 years ago
- Data augmentation for NLP ☆4,645 · Jun 24, 2024 · Updated last year
- An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices. ☆10 · Dec 3, 2024 · Updated last year
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,123 · Apr 20, 2022 · Updated 3 years ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 · Mar 24, 2023 · Updated 2 years ago
- MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems. ☆157 · Sep 27, 2022 · Updated 3 years ago
- A python tool for evaluating the quality of sentence embeddings. ☆2,107 · Mar 19, 2024 · Updated last year
- ☆10 · Sep 14, 2022 · Updated 3 years ago
- A repository summarizing the discussions from beyondBERT, cohort 11.5. ☆57 · Jul 2, 2020 · Updated 5 years ago
- Unsupervised text tokenizer focused on computational efficiency ☆977 · Mar 29, 2024 · Updated last year
- Entity Linker solution ☆1,205 · Sep 21, 2023 · Updated 2 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,924 · Feb 14, 2023 · Updated 2 years ago
- Longformer: The Long-Document Transformer ☆2,184 · Feb 8, 2023 · Updated 3 years ago
- A bot to add citation data from OpenCitations to Wikidata ☆12 · May 23, 2023 · Updated 2 years ago
- Modern spell checking library - accurate, fast, multi-language ☆658 · Aug 29, 2024 · Updated last year
- Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data ☆251 · Jun 3, 2020 · Updated 5 years ago
- SDK for TEASPN, a framework and a protocol for integrated writing assistance environments ☆60 · Dec 9, 2022 · Updated 3 years ago
- Pytorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer. ☆89 · Nov 3, 2023 · Updated 2 years ago
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList ☆2,048 · Jan 9, 2024 · Updated 2 years ago
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,403 · Nov 7, 2025 · Updated 3 months ago
- A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now) ☆28 · May 21, 2021 · Updated 4 years ago
- jiant is an nlp toolkit ☆1,675 · Jul 6, 2023 · Updated 2 years ago
- Data collection, alignment and TAUS repository ☆23 · Nov 30, 2017 · Updated 8 years ago
- ☆120 · Sep 9, 2020 · Updated 5 years ago
- New dataset ☆311 · Aug 31, 2021 · Updated 4 years ago
- FastFormers - highly efficient transformer models for NLU ☆709 · Mar 21, 2025 · Updated 10 months ago
- A Comprehensive survey on business use cases of AI that help them thrive in the digital economy ☆13 · Oct 7, 2020 · Updated 5 years ago
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆51 · Dec 6, 2024 · Updated last year
- Ruby binding for the igraph library. ☆33 · Aug 13, 2009 · Updated 16 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes ☆12 · Jun 24, 2021 · Updated 4 years ago