julesjacobs / levenshtein
A simple proof-of-concept Levenshtein automaton in Python
☆107 Updated 8 years ago
Related projects:
- Various utilities regarding Levenshtein transducers. ☆67 Updated 3 years ago
- HAT-Trie for Python ☆86 Updated 8 years ago
- Roaring Bitmap in Cython ☆79 Updated 4 months ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆80 Updated 6 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆100 Updated 9 years ago
- Python API for Various DB-Backed Simhash Clusters ☆63 Updated 7 years ago
- Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and look… ☆179 Updated 5 years ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 Updated 8 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆299 Updated 3 months ago
- A fast Python implementation of locality sensitive hashing. ☆70 Updated 9 years ago
- embedded graph datastore ☆185 Updated 6 years ago
- Forever incomplete suite of tools for an orthographic/grammatical checker ☆27 Updated 4 years ago
- Locality-sensitive hashing algorithm for text similarity comparisons ☆58 Updated 2 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob. ☆104 Updated 8 years ago
- A fast and comprehensive Java library capable of performing automaton and non-automaton based Levenshtein distance determination and neig… ☆41 Updated 11 years ago
- ☆50 Updated 3 years ago
- Fast directed acyclic word graph generator ☆89 Updated 6 years ago
- Simhashing in C++ ☆132 Updated last year
- C++ Ternary Search Tree implementation with Python bindings ☆43 Updated 6 years ago
- An efficient and flexible token-based regular expression language and engine. ☆74 Updated 10 years ago
- A Utility Library for Wikipedia dumps ☆33 Updated 7 years ago
- [NO LONGER MAINTAINED AS OPEN SOURCE - USE SCALETEXT.COM INSTEAD] ☆109 Updated 11 years ago
- A toolkit that wraps various natural language processing implementations behind a common interface. ☆101 Updated 6 years ago
- Implementation of Burkhard-Keller trees in various languages ☆52 Updated 14 years ago
- Trinity IR Infrastructure ☆235 Updated 4 years ago
- Implementations of various fast parallelized samplers for LDA, including Partially Collapsed LDA, Light LDA, Partially Collapsed Light LD… ☆26 Updated last year
- A C++ library providing fast language model queries in compressed space. ☆128 Updated last year
- My implementation of Explicit Semantic Analysis (ESA) library that we used at KMi, Open University to produce our submission at the NTCIR… ☆36 Updated 8 years ago
- NLP tools developed by Emory University. ☆60 Updated 8 years ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆54 Updated 7 years ago