google-research-datasets / C4_200M-synthetic-dataset-for-grammatical-error-correction
This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences from C4 using a tagged corruption model. The approach and the dataset are described in more detail by Stahlberg and Kumar (2021) (https://www.aclweb.org/anthology/2021.bea-1.4/).
☆162 · Sep 24, 2024 · Updated last year
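To make the idea of tag-conditioned corruption concrete, here is a minimal sketch in Python. It is a rule-based stand-in, not the learned corruption model described by Stahlberg and Kumar (2021): the tag names (`DROP_DET`, `DROP_PUNCT`), the function names, and the record layout are all hypothetical and chosen only for illustration. It shows how a clean sentence plus an error tag could yield a (corrupted source, clean target) training pair.

```python
import re

# Hypothetical tag set for illustration only; these are NOT the tags
# or the model used to build the actual dataset.
DETERMINERS = {"a", "an", "the"}


def corrupt(clean: str, tag: str) -> str:
    """Apply one toy, rule-based corruption selected by `tag`."""
    if tag == "DROP_DET":
        # Remove every article from the sentence.
        kept = [t for t in clean.split() if t.lower() not in DETERMINERS]
        return " ".join(kept)
    if tag == "DROP_PUNCT":
        # Strip common punctuation marks.
        return re.sub(r"[.,;:!?]", "", clean)
    raise ValueError(f"unknown tag: {tag}")


def make_pair(clean: str, tag: str) -> dict:
    """Build one synthetic training example: corrupted source, clean target."""
    return {"source": corrupt(clean, tag), "target": clean, "tag": tag}


if __name__ == "__main__":
    print(make_pair("The cat sat on the mat.", "DROP_DET"))
```

A grammatical error correction model would then be trained to map `source` back to `target`. Conditioning on a tag lets the mix of error types in the synthetic data be controlled explicitly.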
Alternatives and similar repositories for C4_200M-synthetic-dataset-for-grammatical-error-correction
Users interested in C4_200M-synthetic-dataset-for-grammatical-error-correction often compare it to the repositories listed below.
- cLang-8 is a dataset for grammatical error correction. ☆112 · Jul 19, 2022 · Updated 3 years ago
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 · Mar 24, 2023 · Updated 2 years ago
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆458 · Mar 26, 2024 · Updated last year
- Repository to collect and categorize Grammatical Error Correction papers. ☆123 · Jan 30, 2026 · Updated 2 weeks ago
- Source codes of Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction ☆43 · Jul 2, 2021 · Updated 4 years ago
- Improved version of GECToR ☆62 · Jul 24, 2023 · Updated 2 years ago
- [EMNLP 2021] LM-Critic: Language Models for Unsupervised Grammatical Error Correction ☆120 · Sep 26, 2021 · Updated 4 years ago
- Official implementation of the papers "GECToR – Grammatical Error Correction: Tag, Not Rewrite" (BEA-20) and "Text Simplification by Tagg… ☆950 · May 21, 2024 · Updated last year
- ☆120 · Sep 9, 2020 · Updated 5 years ago
- MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems. ☆157 · Sep 27, 2022 · Updated 3 years ago
- The official code of the "Frustratingly Easy System Combination for Grammatical Error Correction" paper ☆57 · Mar 4, 2024 · Updated last year
- The official code of the 2023 ACL paper "Enhancing Grammatical Error Correction Systems with Explanations" ☆29 · Jul 31, 2023 · Updated 2 years ago
- This repository contains materials for our tutorial on automatic grammatical error correction: R. Grundkiewicz, C. Bryant, M. Felice: A C… ☆38 · Dec 12, 2020 · Updated 5 years ago
- Repository of "An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction" (EMNLP-IJCNLP 2019) ☆68 · Dec 23, 2019 · Updated 6 years ago
- ☆15 · Mar 15, 2022 · Updated 3 years ago
- ☆17 · Jan 8, 2021 · Updated 5 years ago
- Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data ☆251 · Jun 3, 2020 · Updated 5 years ago
- GMEG ☆31 · Nov 21, 2024 · Updated last year
- Codes for the paper "Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding" (ACL-IJCNLP 2021) ☆41 · Jun 7, 2021 · Updated 4 years ago
- The code for EMNLP2022 paper "Improved grammatical error correction by ranking elementary edits" ☆20 · Dec 14, 2022 · Updated 3 years ago
- NeuSpell: A Neural Spelling Correction Toolkit ☆706 · Jul 31, 2023 · Updated 2 years ago
- A web application that interfaces two GEC systems. [web instance is down] ☆32 · Aug 2, 2024 · Updated last year
- Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models ☆29 · Apr 27, 2024 · Updated last year
- Source code for paper Grammatical Error Correction in Low-Resource Scenarios (W-NUT 2019) ☆13 · Jun 21, 2022 · Updated 3 years ago
- Automatic extraction of edited sentences from text edition histories. ☆83 · Feb 14, 2022 · Updated 4 years ago
- JFLEG (JHU FLuency-Extended GUG) corpus for Grammatical Error Correction Evaluation ☆114 · Jun 11, 2023 · Updated 2 years ago
- ☆18 · Sep 16, 2017 · Updated 8 years ago
- A framework for detecting, highlighting and correcting grammatical errors on natural language text. Created by Prithiviraj Damodaran. Ope… ☆1,569 · Feb 15, 2023 · Updated 3 years ago
- American English Pronunciation Dictionary ☆34 · Apr 16, 2018 · Updated 7 years ago
- Convert Standard M2 format to parallel sentences. ☆22 · Jun 20, 2020 · Updated 5 years ago
- ACL2023 (Oral): TemplateGEC: Improving Grammatical Error Correction with Detection Template ☆22 · Jul 10, 2023 · Updated 2 years ago
- Data and code accompanying the paper "As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive… ☆22 · Apr 13, 2023 · Updated 2 years ago
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x. ☆589 · Apr 24, 2023 · Updated 2 years ago
- Neural quality estimation toolkit for grammatical error correction and other language generation applications. ☆49 · Mar 19, 2019 · Updated 6 years ago
- evaluation suite for testing automatic grammatical error corrections ☆39 · Jun 12, 2017 · Updated 8 years ago
- A neural text style transfer model ☆12 · Jun 23, 2019 · Updated 6 years ago
- ☆129 · Nov 3, 2022 · Updated 3 years ago
- ☆59 · Apr 24, 2021 · Updated 4 years ago
- Code for EMNLP 2021 paper: Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting ☆17 · Nov 30, 2021 · Updated 4 years ago