fujimotos / mbleven
An efficient algorithm for k-bounded (Damerau-)Levenshtein distance
☆16 Updated 6 years ago
Alternatives and similar repositories for mbleven:
Users who are interested in mbleven are comparing it to the libraries listed below.
- Code and data from the paper "Email formality in the workplace: A case study on the Enron corpus" ☆10 Updated 9 years ago
- A tool for detecting sentence fragments. ☆7 Updated 8 years ago
- Dynamic weighted sampling with replacement ☆14 Updated 8 years ago
- allennlp + streamlit demo ☆22 Updated 5 years ago
- bin files ☆13 Updated last month
- GSDMM: Short text clustering (Rust implementation) ☆23 Updated last year
- Supporting example for "A Rust SentencePiece implementation" ☆18 Updated 4 years ago
- A workflow system for Natural Language Processing. ☆21 Updated 5 years ago
- brat rapid annotation tool (brat) - for all your textual annotation needs ☆10 Updated 7 years ago
- Memory-efficient Count-Min Sketch Counter (based on Madoka C++ library) ☆26 Updated 6 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆24 Updated 5 months ago
- Official repository of Quickscorer: a fast algorithm to rank documents with additive ensembles of regression trees. ☆18 Updated 8 years ago
- A C++ library implementing fast language models estimation using the 1-Sort algorithm. ☆17 Updated last year
- Playing with arithmetic coding and RNNs ☆22 Updated 8 years ago
- Text readability metrics in Python. ☆11 Updated 11 years ago
- An author identification system based on recur ☆21 Updated 8 years ago
- A pure Python implementation of Aho-Corasick algorithm. ☆22 Updated 6 years ago
- Contains the main implementation of programs for the paper: Reproducing and learning new algebraic operations on word embeddings using ge… ☆12 Updated 8 years ago
- Official library of images for the SIGIR 2019 Open-Source IR Replicability Challenge (OSIRRC 2019) ☆13 Updated 5 years ago
- Implementation of Bayesian Sets for fast similarity searches. ☆14 Updated 13 years ago
- Yet another regression toolkit ☆12 Updated 11 years ago
- Anytime Ranking for Impact-Ordered Indexes ☆12 Updated 8 years ago
- Clustering documents based on LSH ☆14 Updated 8 years ago
- Short Text Similarity as described in https://dl.acm.org/citation.cfm?id=2806475 ☆16 Updated 6 years ago
- Code and data for "Universal Approximation Functions for Fast Learning to Rank: Replacing Expensive Regression Forests with Simple Feed-F… ☆9 Updated 6 years ago
- An open-source NLP library: fast text cleaning and preprocessing ☆23 Updated 3 years ago
- My most frequently used learning-to-rank algorithms ported to rust for efficiency. Try it: "pip install fastrank". ☆52 Updated last week
- A DeepWalk implementation for ontologies using NetworkX and Gensim ☆18 Updated 7 years ago
- A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked up semantic blocks. ☆15 Updated 11 years ago
- Python binding of cedar (implementation of efficiently-updatable double-array trie) using Cython ☆17 Updated 5 years ago