uds-lsv / Noisy-Channel-Spell-Checker
A tool for correcting misspellings in textual input using the Noisy Channel Model.
☆11 · Updated 4 years ago
Alternatives and similar repositories for Noisy-Channel-Spell-Checker:
Users who are interested in Noisy-Channel-Spell-Checker are comparing it to the libraries listed below.
- BERT and ELECTRA models trained on Europeana Newspapers ☆37 · Updated 3 years ago
- ☆17 · Updated last year
- Learning BPE embeddings by first learning a segmentation model and then training word2vec ☆19 · Updated 2 years ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities. ☆15 · Updated 7 months ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼 ☆22 · Updated 2 months ago
- Python package for calculating famous measures in computational linguistics ☆13 · Updated 4 months ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆33 · Updated 9 months ago
- Generic Environment for Context-Aware Correction of Orthography ☆22 · Updated 2 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆14 · Updated 7 months ago
- GC4LM: A Colossal (Biased) language model for German ☆13 · Updated 3 years ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆35 · Updated last month
- OCR-D post-correction module based on weighted finite-state transducers ☆11 · Updated last year
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆23 · Updated 4 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 · Updated last month
- An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. For inst… ☆22 · Updated 3 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Updated last year
- A tiny BERT for low-resource monolingual models ☆31 · Updated 6 months ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆61 · Updated 4 years ago
- Tool for parsing and converting various span encoding schemes. ☆23 · Updated last year
- zero-vocab or low-vocab embeddings ☆18 · Updated 2 years ago
- Spell checker using Brill and Moore's noisy channel error model ☆11 · Updated 6 years ago
- Multilingual Open Text ☆25 · Updated 5 months ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated 2 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 · Updated 2 years ago
- PALI: Language identification for Perso-Arabic Scripts ☆9 · Updated last year
- Breaks a word into syllables using an LSTM-based neural network. ☆19 · Updated last year
- UniParse: A universal graph-based parsing toolkit ☆10 · Updated 5 years ago
- ☆64 · Updated 2 years ago
- A set of methods for finding an appropriate number of topics in a text collection ☆15 · Updated last week
- Docker for HF wav2vec2-sprint ☆13 · Updated 4 years ago