julesjacobs / levenshtein
A simple proof-of-concept Levenshtein automaton in Python
☆109 · Updated 9 years ago
Alternatives and similar repositories for levenshtein:
Users who are interested in levenshtein are comparing it to the libraries listed below.
- Finite state dictionaries in Java ☆130 · Updated 3 years ago
- Various utilities regarding Levenshtein transducers. ☆68 · Updated 4 years ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 · Updated 9 years ago
- Implementations of various fast parallelized samplers for LDA, including Partially Collapsed LDA, Light LDA, Partially Collapsed Light LD… ☆27 · Updated 2 years ago
- Roaring Bitmap in Cython ☆81 · Updated 11 months ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 · Updated 5 years ago
- Simhashing in C++ ☆132 · Updated 2 years ago
- C++ Ternary Search Tree implementation with Python bindings ☆43 · Updated 7 years ago
- Fast directed acyclic word graph generator ☆91 · Updated 6 years ago
- A C++ library providing fast language model queries in compressed space. ☆129 · Updated 2 years ago
- Locality-sensitive hashing algorithm for text similarity comparisons ☆58 · Updated 3 weeks ago
- Compact Data Structures Library ☆124 · Updated 10 years ago
- Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering ☆74 · Updated 3 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆82 · Updated 9 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆301 · Updated 10 months ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 · Updated 6 years ago
- Implementation of Burkhard-Keller trees in various languages ☆52 · Updated 15 years ago
- EliasFanoCompression: quasi-succinct compression of sorted integers in C# ☆45 · Updated 3 years ago
- An inverted trigram index for accelerated string matching in Sqlite. ☆77 · Updated 11 years ago
- A Utility Library for Wikipedia dumps ☆33 · Updated 8 years ago
- Keyvi - a key value index that powers Cliqz search engine. It is an in-memory FST-based data structure highly optimized for size and look… ☆177 · Updated 6 years ago
- *Deprecated* A fast and accurate part-of-speech tagger for TextBlob. ☆102 · Updated 9 years ago
- Fast and robust NLP components implemented in Java. ☆52 · Updated 4 years ago
- Suite of universal indexes for Highly Repetitive Document Collections ☆20 · Updated 4 years ago
- Build tables of information by extracting facts from indexed text corpora via a simple and effective query language. ☆56 · Updated 6 years ago
- The WikiBrain Java library enables researchers and developers to incorporate state-of-the-art Wikipedia-based algorithms and technologies… ☆93 · Updated 6 years ago
- A collection of succinct data structures ☆201 · Updated last year
- SociaLite: query language for large-scale graph analysis and data mining ☆109 · Updated 8 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 · Updated 9 years ago
- NLP tools developed by Emory University. ☆60 · Updated 8 years ago