oliverguhr / spelling
This is a neural spell checker.
☆62 Updated 2 years ago
Alternatives and similar repositories for spelling:
Users who are interested in spelling are comparing it to the libraries listed below:
- A spaCy custom component that extracts and normalizes temporal expressions ☆52 Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆38 Updated last year
- This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi… ☆14 Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 Updated 3 months ago
- Language Identification with Support for More Than 2000 Labels -- EMNLP 2023 ☆110 Updated last month
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆77 Updated 4 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 7 months ago
- Bicleaner fork that uses neural networks ☆39 Updated 5 months ago
- Sentence transformers models for spaCy ☆107 Updated last year
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆159 Updated 3 months ago
- German Alpaca Dataset (Cleaned + Translated) ☆23 Updated last year
- A French sequence-to-sequence pretrained model ☆57 Updated 2 years ago
- ☆44 Updated 5 months ago
- Reduce the size of pretrained Hugging Face models via vocabulary trimming. ☆43 Updated 2 years ago
- Multilingual sentence alignment using sentence embeddings ☆106 Updated 2 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆50 Updated this week
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated this week
- NTREX -- News Test References for MT Evaluation ☆80 Updated 7 months ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 Updated 10 months ago
- xfspell — the Transformer Spell Checker ☆188 Updated 4 years ago
- Python-based implementation of the Translate-Align-Retrieve method to automatically translate the SQuAD Dataset to Spanish. ☆59 Updated 2 years ago
- Simply, faster, sentence-transformers ☆139 Updated 4 months ago
- 📝 An easy-to-use package to restore punctuation of the text. ☆111 Updated last year
- Code and models used in "MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ☆98 Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 Updated last year
- Wikipedia text corpus for self-supervised NLP model training ☆41 Updated 2 years ago
- Improved Sentence Alignment in Linear Time and Space ☆163 Updated last year
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆72 Updated last year
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 Updated 3 years ago
- ☆136 Updated 10 months ago