universal-automata / liblevenshtein
Various utilities regarding Levenshtein transducers.
☆67 Updated 4 years ago
Alternatives and similar repositories for liblevenshtein
Users who are interested in liblevenshtein are comparing it to the libraries listed below:
- A simple proof of concept levenshtein automaton in Python ☆109 Updated 9 years ago
- Various utilities regarding Levenshtein transducers. (Java) ☆57 Updated 3 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 Updated 6 years ago
- A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked up semantic blocks. ☆15 Updated 11 years ago
- Ukb: graph-based WSD and similarity ☆107 Updated 8 months ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 Updated 6 years ago
- Fast and robust NLP components implemented in Java. ☆52 Updated 4 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts ☆59 Updated 12 years ago
- Software and resources for natural language processing. ☆131 Updated 8 years ago
- HAT-Trie for Python ☆86 Updated 9 years ago
- command-line tool to extract taxonomies from Wikidata ☆126 Updated 5 years ago
- WordNet RDF export ☆25 Updated 7 years ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆56 Updated 7 years ago
- A tool for visualizing trees, tailored specifically to the analysis of parse trees. ☆81 Updated 4 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆67 Updated last week
- A Utility Library for Wikipedia dumps ☆33 Updated 7 years ago
- C++ implementation of hamming distance algorithm HmSearch using Kyoto Cabinet ☆42 Updated 8 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date ☆41 Updated 4 years ago
- Semantic Web related concepts converted to Natural language ☆44 Updated 7 years ago
- TuffyLite is an open-source MLN inference engine that modifies the original Tuffy solver. ☆27 Updated 8 years ago
- A C++ library providing fast language model queries in compressed space. ☆129 Updated last year
- Pixy is a declarative vendor-independent graph query language built on the Tinkerpop software stack ☆36 Updated 3 years ago
- hyp: hypergraphs toolkit ☆31 Updated 8 years ago
- Build tables of information by extracting facts from indexed text corpora via a simple and effective query language. ☆56 Updated 5 years ago
- Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles. ☆38 Updated 6 years ago
- Open-Source Information Retrieval Reproducibility Challenge ☆50 Updated 9 years ago
- Search for similar short strings ☆53 Updated 4 years ago
- PDF Extraction Toolkit ☆41 Updated 4 years ago