uds-lsv / Noisy-Channel-Spell-Checker
A tool for correcting misspellings in textual input using the Noisy Channel Model.
☆11Updated 4 years ago
Alternatives and similar repositories for Noisy-Channel-Spell-Checker
Users who are interested in Noisy-Channel-Spell-Checker are comparing it to the libraries listed below.
- Source code for the Apple reproduction☆32Updated 4 years ago
- A simple neural truecaser written in pytorch and allennlp.☆33Updated last year
- Text processing library for sentiment analysis and related tasks☆27Updated 6 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 2 years ago
- Multilingual Open Text☆25Updated 2 months ago
- Code for paper "Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks"☆11Updated 6 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆24Updated 4 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 2 years ago
- Execute arbitrary SQL queries on 🤗 Datasets☆32Updated last year
- BERT models for many languages created from Wikipedia texts☆33Updated 5 years ago
- ☆15Updated 6 years ago
- Generic Environment for Context-Aware Correction of Orthography☆22Updated 2 years ago
- A toolkit for producing n-gram language models. The highlights are the implementation of Kneser-Ney growing and revised Kneser pruning me…☆40Updated 10 months ago
- Multilingual and monolingual corpora focused on Caucasus languages for Natural Language Processing (NLP)☆35Updated 7 months ago
- Training BERT for punctuation task☆10Updated 4 years ago
- Breaks a word into syllables using an LSTM-based neural network.☆20Updated last year
- docker for HF wav2vec2-sprint☆13Updated 4 years ago
- Arabic Phonetic Dictionary Generator Tool for Automatic Speech Recognition Applications☆13Updated 3 years ago
- Temporarily remove unused tokens during training to save RAM and improve speed.☆24Updated last month
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end.☆71Updated 2 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow☆14Updated 2 years ago
- Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼☆22Updated 5 months ago
- Code for pre-training CharacterBERT models (as well as BERT models).☆34Updated 3 years ago
- Featurize words into orthographic and phonological vectors.☆41Updated 2 years ago
- Library for fast text representation and classification.☆30Updated last year
- Converter from UD-trees to BART representation☆36Updated last year
- Repository with illustrations for cft-contest-2018☆12Updated 6 years ago
- 🎯 Speech Recognition Challenge by Speech Lab - IIT Madras☆11Updated 4 years ago
- n-gram language models☆14Updated last year