julesjacobs / levenshtein
A simple proof-of-concept Levenshtein automaton in Python
☆107 · Updated 9 years ago
Alternatives and similar repositories for levenshtein:
Users who are interested in levenshtein are comparing it to the libraries listed below.
- Finite state dictionaries in Java ☆130 · Updated 2 years ago
- HAT-Trie for Python ☆86 · Updated 8 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆300 · Updated 7 months ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 · Updated 9 years ago
- Roaring Bitmap in Cython ☆79 · Updated 8 months ago
- Various utilities regarding Levenshtein transducers. ☆67 · Updated 4 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob. ☆102 · Updated 9 years ago
- Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering ☆75 · Updated 2 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 · Updated 9 years ago
- Implementation of Bayesian Sets for fast similarity searches. ☆15 · Updated 13 years ago
- A fast Python implementation of locality sensitive hashing. ☆70 · Updated 9 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 · Updated 4 years ago
- Golomb Coded Sets ☆91 · Updated 7 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 · Updated 7 years ago
- Extract a plain text corpus from MediaWiki XML dumps, such as Wikipedia. ☆132 · Updated 6 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm: "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆82 · Updated 8 years ago
- Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and look… ☆178 · Updated 6 years ago
- A library of inverted index data structures ☆148 · Updated 2 years ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆56 · Updated 7 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 · Updated 6 years ago
- Simhashing in C++ ☆134 · Updated last year
- An inverted trigram index for accelerated string matching in Sqlite. ☆77 · Updated 10 years ago
- ☆50 · Updated 4 years ago
- ☆28 · Updated 9 years ago
- Sux4J is an effort to bring succinct data structures to Java. ☆156 · Updated last year
- Hidden alignment conditional random field for classifying string pairs. ☆37 · Updated 7 years ago
- Compilation and rule-based optimization framework for relational algebra. Raco is the language, optimization, and query translation layer… ☆72 · Updated 6 years ago
- C++ implementation of hamming distance algorithm HmSearch using Kyoto Cabinet ☆42 · Updated 8 years ago