troshko111 / fast-fuzzy-matching
BK-tree with Damerau-Levenshtein distance and Trie with Levenshtein distance
☆19 Updated 8 years ago
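As a rough illustration of the first technique named in the description, here is a minimal sketch of a BK-tree indexed by Damerau-Levenshtein distance. It is not taken from the repository: the names `damerau_levenshtein` and `BKTree` are hypothetical. It uses the unrestricted variant of the distance, because a BK-tree relies on the triangle inequality and the restricted "optimal string alignment" variant does not satisfy it.

```python
def damerau_levenshtein(a, b):
    """Unrestricted Damerau-Levenshtein distance (a true metric)."""
    last_row = {}                      # last row in which each character of `a` was seen
    inf = len(a) + len(b)              # sentinel larger than any real distance
    d = [[0] * (len(b) + 2) for _ in range(len(a) + 2)]
    d[0][0] = inf
    for i in range(len(a) + 1):
        d[i + 1][0] = inf
        d[i + 1][1] = i
    for j in range(len(b) + 1):
        d[0][j + 1] = inf
        d[1][j + 1] = j
    for i in range(1, len(a) + 1):
        last_col = 0                   # last column in this row where characters matched
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            else:
                cost = 1
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                            # substitution or match
                d[i + 1][j] + 1,                           # insertion
                d[i][j + 1] + 1,                           # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),   # transposition
            )
        last_row[a[i - 1]] = i
    return d[len(a) + 1][len(b) + 1]


class BKTree:
    """BK-tree: each child edge is labeled with its distance to the parent word."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None               # a node is a pair (word, {edge_distance: child})

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            dist = self.distance(word, node[0])
            if dist == 0:
                return                 # word is already stored
            child = node[1].get(dist)
            if child is None:
                node[1][dist] = (word, {})
                return
            node = child

    def search(self, query, max_dist):
        """Return sorted (distance, word) pairs within max_dist of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            word, children = stack.pop()
            dist = self.distance(query, word)
            if dist <= max_dist:
                results.append((dist, word))
            # Triangle inequality: only edges in [dist - max_dist, dist + max_dist] can match.
            for edge, child in children.items():
                if dist - max_dist <= edge <= dist + max_dist:
                    stack.append(child)
        return sorted(results)


words = ["book", "books", "boo", "cake", "cape", "boon", "cook", "cart"]
tree = BKTree(damerau_levenshtein)
for w in words:
    tree.add(w)
matches = tree.search("bok", 1)
```

The pruning step in `search` is what makes the tree useful: subtrees whose edge label falls outside the allowed band are skipped without computing any distances inside them.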
Alternatives and similar repositories for fast-fuzzy-matching
Users who are interested in fast-fuzzy-matching are comparing it to the libraries listed below.
- CRFSharp is Conditional Random Fields implemented in .NET (C#), a machine learning algorithm for learning from labeled sequences of exampl… ☆122 Updated 5 years ago
- Txt2Vec is a toolkit to represent text by vector. It's based on Google's word2vec project, but with some new features, such as incremental t… ☆68 Updated 9 years ago
- ReactGraph is a library to make change propagation easy in .NET. ☆63 Updated 10 years ago
- JSON benchmark for .NET and Java ☆72 Updated 2 years ago
- ☆32 Updated 9 years ago
- Fast approximate strings search & spelling correction ☆58 Updated 3 years ago
- A large-scale statistical machine translation system written in Java. ☆212 Updated 3 years ago
- Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP ☆15 Updated 8 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json/avro dump ☆254 Updated last year
- A fast and comprehensive Java library capable of performing automaton and non-automaton based Levenshtein distance determination and neig… ☆43 Updated 12 years ago
- Exploration Library in C# ☆16 Updated last year
- Java and .NET client interface for Pyro5 protocol ☆184 Updated 4 months ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- A utility to randomize and split really huge (100+ GB) text files ☆21 Updated 8 years ago
- NLP tools developed by Emory University. ☆61 Updated 9 years ago
- Simple in-memory bitmap index written in C# ☆15 Updated 12 years ago
- Advanced Utility Libs ☆24 Updated 5 years ago
- Java text categorization system ☆57 Updated 8 years ago
- Memory-based shallow parser for Python ☆74 Updated 6 years ago
- OCRonet is an optical character recognition (OCR) and document analysis system based on Convolutional Neural Networks (LeNet-5) and OCRopus. ☆21 Updated 6 years ago
- SymSpellCompound: compound aware automatic spelling correction ☆65 Updated 7 years ago
- An efficient and flexible token-based regular expression language and engine. ☆75 Updated 11 years ago
- Go-like DSL for C# ☆50 Updated 7 years ago
- Machine learning components for Apache UIMA ☆131 Updated 2 years ago
- HyperLogLog-based set cardinality estimation library ☆95 Updated 2 months ago
- Implementation of Aho-Corasick string matching algorithm for .NET ☆31 Updated 9 years ago
- Tools for working with wikidata (structured data from wikipedia) ☆13 Updated 9 years ago
- NLP toolkit (tokenizer, POS-tagger, parser, etc.) ☆43 Updated 8 years ago
- A text tagger based on Lucene / Solr, using FST technology ☆177 Updated last year
- A Utility Library for Wikipedia dumps ☆33 Updated 8 years ago