oliverguhr / spelling
This is a neural spell checker
★63 · Updated 2 years ago
Alternatives and similar repositories for spelling:
Users who are interested in spelling compare it to the libraries listed below.
- A French sequence-to-sequence pretrained model ★57 · Updated 2 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ★40 · Updated last year
- An easy-to-use package to restore punctuation of the text. ★112 · Updated last year
- OpusFilter - Parallel corpus processing toolkit ★104 · Updated 3 weeks ago
- Bicleaner fork that uses neural networks ★39 · Updated 6 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ★49 · Updated last month
- ★45 · Updated 6 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★151 · Updated 8 months ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ★72 · Updated last year
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ★116 · Updated 2 months ago
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ★160 · Updated 4 months ago
- Reduce the size of pretrained Hugging Face models via vocabulary trimming. ★43 · Updated 2 years ago
- MAFAND-MT ★55 · Updated 7 months ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ★77 · Updated 5 months ago
- A tool that locates, downloads, and extracts machine translation corpora ★150 · Updated 8 months ago
- A collection of scripts to preprocess ASR datasets and finetune language-specific Wav2Vec2 XLSR models ★31 · Updated 3 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ★59 · Updated 2 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ★70 · Updated 9 months ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ★24 · Updated last year
- Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0. ★102 · Updated 2 years ago
- xfspell - the Transformer Spell Checker ★188 · Updated 4 years ago
- NTREX -- News Test References for MT Evaluation ★81 · Updated 8 months ago
- Improved Sentence Alignment in Linear Time and Space ★165 · Updated last year
- ★136 · Updated 11 months ago
- cLang-8 is a dataset for grammatical error correction. ★103 · Updated 2 years ago
- Complementary code for our paper "Automatic punctuation restoration with BERT models" ★48 · Updated last year
- Curriculum training ★16 · Updated last month
- Efficient Low-Memory Aligner ★141 · Updated last month
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ★154 · Updated 8 months ago
- ★25 · Updated last year