julesjacobs / levenshtein
A simple proof-of-concept Levenshtein automaton in Python
☆109 Updated 9 years ago
Alternatives and similar repositories for levenshtein:
Users who are interested in levenshtein are comparing it to the libraries listed below:
- Finite state dictionaries in Java ☆130 Updated 3 years ago
- Various utilities regarding Levenshtein transducers. ☆67 Updated 4 years ago
- A C++ library providing fast language model queries in compressed space. ☆129 Updated last year
- Open-Source Information Retrieval Reproducibility Challenge ☆50 Updated 9 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 Updated 6 years ago
- Locality-sensitive hashing algorithm for text similarity comparisons ☆58 Updated 3 years ago
- HAT-Trie for Python ☆86 Updated 9 years ago
- A fast Python implementation of locality sensitive hashing. ☆70 Updated 9 years ago
- Implementation of Burkhard-Keller trees in various languages ☆52 Updated 14 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆300 Updated 8 months ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆56 Updated 7 years ago
- Implementation of Bayesian Sets for fast similarity searches. ☆14 Updated 13 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Language Detection with Infinity-gram ☆231 Updated 9 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆82 Updated 8 years ago
- A library of inverted index data structures ☆148 Updated 2 years ago
- Fast and robust NLP components implemented in Java. ☆52 Updated 4 years ago
- ☆92 Updated 9 years ago
- Trinity IR Infrastructure ☆237 Updated 5 years ago
- An inverted trigram index for accelerated string matching in Sqlite. ☆77 Updated 10 years ago
- Forever incomplete suite of tools for an orthographic/grammatical checker ☆28 Updated 5 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 Updated 9 years ago
- EliasFanoCompression: quasi-succinct compression of sorted integers in C# ☆45 Updated 3 years ago
- Generalized Language Modeling toolkit ☆51 Updated 2 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆36 Updated 7 years ago
- A toolkit that wraps various natural language processing implementations behind a common interface. ☆101 Updated 7 years ago
- Rewrite text in linear time. ☆81 Updated last year
- Non-Overlapping Aho-Corasick Python extension, for Python 2 (str and unicode) and Python 3 ☆51 Updated 9 years ago
- Compute association strength over semantic networks in a dimensionality-reduced form. ☆32 Updated 9 years ago
- Sometimes you just need a lot of text. Plainstream is a small Python app that provides you with a plain text stream directly from Wikiped… ☆24 Updated last year