wsong / Typo-Distance
Finds the likelihood that one string is a typo of another and generates likely typos from a given string
☆61 Updated 14 years ago
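The description above names two tasks: scoring how likely one string is a typo of another, and generating likely typos. As a minimal sketch only, and not the repository's actual code, the block below uses a weighted edit distance in which substituting a neighbouring key costs less than substituting a distant one. The names `KEY_POS`, `substitution_cost`, `typo_distance`, and `likely_typos`, the 0.5 and 1.0 costs, and the simplified unstaggered QWERTY grid are all assumptions made for illustration.

```python
# Hypothetical sketch; names, costs and keyboard model are assumptions,
# not taken from the Typo-Distance repository.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each letter to a (row, column) position on a simplified,
# unstaggered QWERTY grid.
KEY_POS = {
    ch: (r, c)
    for r, row in enumerate(QWERTY_ROWS)
    for c, ch in enumerate(row)
}


def substitution_cost(a, b):
    """Return 0.0 for equal characters, 0.5 for neighbouring keys, else 1.0."""
    if a == b:
        return 0.0
    if a in KEY_POS and b in KEY_POS:
        (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
        if abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1:
            return 0.5
    return 1.0


def typo_distance(source, target):
    """Weighted edit distance: insertions and deletions cost 1.0 each."""
    # prev[j] holds the cost of turning the processed prefix of `source`
    # into the first j characters of `target`.
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        curr = [float(i)]
        for j, t in enumerate(target, 1):
            curr.append(min(
                prev[j] + 1.0,                          # delete s
                curr[j - 1] + 1.0,                      # insert t
                prev[j - 1] + substitution_cost(s, t),  # substitute s -> t
            ))
        prev = curr
    return prev[-1]


def likely_typos(word):
    """Generate variants of `word` with one letter replaced by a neighbouring key."""
    typos = set()
    for i, ch in enumerate(word):
        for other in KEY_POS:
            if other != ch and substitution_cost(ch, other) == 0.5:
                typos.add(word[:i] + other + word[i + 1:])
    return sorted(typos)
```

Under these assumptions a lower distance means a more plausible typo: `typo_distance("cat", "car")` is smaller than `typo_distance("cat", "cap")` because `r` sits next to `t` on the keyboard while `p` does not. Turning the distance into an actual likelihood would need a probability model that this sketch does not attempt.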
Alternatives and similar repositories for Typo-Distance
Users who are interested in Typo-Distance are comparing it to the libraries listed below.
- Language Detection with Infinity-gram ☆230 Updated 10 years ago
- Fast Word Clustering Software ☆79 Updated last year
- SimString ☆113 Updated 4 years ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 Updated 10 years ago
- Python wrapper around SVDLIBC, a fast library for sparse Singular Value Decomposition ☆55 Updated 12 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆36 Updated 8 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆129 Updated this week
- An unsupervised compound splitter ☆42 Updated 6 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆81 Updated 9 years ago
- ☆98 Updated 4 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆171 Updated 4 years ago
- SymSpellCompound: compound aware automatic spelling correction ☆65 Updated 7 years ago
- Framework for doing NER and other types of entity recognition, in Python ☆68 Updated 3 years ago
- ☆47 Updated 9 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆91 Updated 7 years ago
- Named Entity Recognition data for Europeana Newspapers ☆173 Updated 2 years ago
- A deep, LSTM-based part of speech tagger and sentiment analyser using character embeddings instead of words. Compatible with Theano and T… ☆92 Updated 8 years ago
- Python bindings to the Compact Language Detector ☆33 Updated 5 years ago
- C++ implementation of Generalised Brown clustering and Python scripts for feature generation ☆41 Updated 9 years ago
- Socially-Equitable Language Identification ☆78 Updated 2 years ago
- ☆32 Updated 5 years ago
- Fast and robust NLP components implemented in Java. ☆53 Updated 5 years ago
- Tutorial and review of word2vec / doc2vec ☆104 Updated 10 years ago
- My most frequently used learning-to-rank algorithms ported to Rust for efficiency. Try it: "pip install fastrank". ☆52 Updated 11 months ago
- ☆23 Updated last year
- A Utility Library for Wikipedia dumps ☆33 Updated 8 years ago
- Temporal Expression Recognition and Normalisation in Python ☆77 Updated 10 years ago
- Locality-sensitive hashing algorithm for text similarity comparisons ☆59 Updated 9 months ago
- Python scripts for retrieving CSV data from the Google Ngram Viewer and plotting it in XKCD style. The Python script for retrieving ngram… ☆254 Updated 5 years ago
- Framework for evaluating text extraction algorithms implemented as web services ☆42 Updated 13 years ago