GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
☆514 · Dec 11, 2019 · Updated 6 years ago
Alternatives and similar repositories for github-typo-corpus
Users who are interested in github-typo-corpus are comparing it to the libraries listed below.
- xfspell: the Transformer Spell Checker ☆189 · Jun 18, 2020 · Updated 5 years ago
- NanigoNet: Language detector for code-mixed input supporting 150+19 human+programming languages using deep neural networks ☆71 · May 22, 2023 · Updated 2 years ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆711 · Jul 31, 2023 · Updated 2 years ago
- GMEG ☆31 · Nov 21, 2024 · Updated last year
- Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagg… ☆959 · May 21, 2024 · Updated last year
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆461 · Mar 26, 2024 · Updated 2 years ago
- Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task ☆93 · Sep 19, 2019 · Updated 6 years ago
- Misspelling Oblivious Word Embeddings ☆201 · Aug 6, 2019 · Updated 6 years ago
- MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems. ☆158 · Sep 27, 2022 · Updated 3 years ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 · Mar 24, 2023 · Updated 3 years ago
- A repository summarizing the discussions of beyondBERT, cohort 11.5. ☆57 · Jul 2, 2020 · Updated 5 years ago
- ☆120 · Sep 9, 2020 · Updated 5 years ago
- ☆17 · Jan 8, 2021 · Updated 5 years ago
- ☆10 · Sep 14, 2022 · Updated 3 years ago
- Data augmentation for NLP ☆4,652 · Jun 24, 2024 · Updated last year
- Source code for paper Grammatical Error Correction in Low-Resource Scenarios (W-NUT 2019) ☆13 · Jun 21, 2022 · Updated 3 years ago
- A Visual Analysis Tool to Explore Learned Representations in Transformers Models ☆603 · Feb 7, 2024 · Updated 2 years ago
- A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now) ☆28 · May 21, 2021 · Updated 4 years ago
- Universal Dependency Treebanks in Korean ☆38 · Dec 19, 2021 · Updated 4 years ago
- Unsupervised text tokenizer focused on computational efficiency ☆977 · Mar 29, 2024 · Updated last year
- Modern spell checking library - accurate, fast, multi-language ☆658 · Aug 29, 2024 · Updated last year
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ☆359 · Feb 22, 2022 · Updated 4 years ago
- A python tool for evaluating the quality of sentence embeddings. ☆2,106 · Mar 19, 2024 · Updated 2 years ago
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,124 · Apr 20, 2022 · Updated 3 years ago
- Automatic extraction of edited sentences from text edition histories. ☆83 · Feb 14, 2022 · Updated 4 years ago
- Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data ☆251 · Jun 3, 2020 · Updated 5 years ago
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,927 · Feb 14, 2023 · Updated 3 years ago
- Models, system configurations and outputs of our winning GEC systems in the BEA 2019 shared task described in R. Grundkiewicz, M. Junczys… ☆51 · Oct 22, 2019 · Updated 6 years ago
- Pytorch implementation and pre-trained Japanese model for CANINE, the efficient character-level transformer. ☆89 · Nov 3, 2023 · Updated 2 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes ☆12 · Jun 24, 2021 · Updated 4 years ago
- SDK for TEASPN, a framework and a protocol for integrated writing assistance environments ☆60 · Dec 9, 2022 · Updated 3 years ago
- Longformer: The Long-Document Transformer ☆2,188 · Feb 8, 2023 · Updated 3 years ago
- jiant is an nlp toolkit ☆1,674 · Jul 6, 2023 · Updated 2 years ago
- BLEURT is a metric for Natural Language Generation based on transfer learning. ☆789 · Aug 4, 2023 · Updated 2 years ago
- Code and model files for the paper: "A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction" (AAAI-18… ☆184 · Dec 13, 2018 · Updated 7 years ago
- Deliver the ready-to-train data to your NLP model. ☆122 · Jul 15, 2022 · Updated 3 years ago
- Entity Linker solution ☆1,206 · Sep 21, 2023 · Updated 2 years ago
- Unsupervised Word Segmentation for Neural Machine Translation and Text Generation ☆2,267 · Aug 7, 2024 · Updated last year
- 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy ☆1,403 · Mar 20, 2026 · Updated last week