wolfgarbe / WordSegmentationTM
Fast Word Segmentation with Triangular Matrix
☆82 · Updated 4 years ago
Alternatives and similar repositories for WordSegmentationTM
Users interested in WordSegmentationTM are comparing it to the libraries listed below:
- Fast approximate string search & spelling correction ☆59 · Updated 4 years ago
- SymSpellCompound: compound aware automatic spelling correction ☆65 · Updated 7 years ago
- Word Segmentation with Dynamic Programming ☆20 · Updated 4 years ago
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts ☆81 · Updated 7 years ago
- CRFSharp is Conditional Random Fields implemented in .NET (C#), a machine learning algorithm for learning from labeled sequences of exampl… ☆122 · Updated 5 years ago
- CUI-based Tree Visualizer for Universal Dependencies and Immediate Catena Analysis ☆108 · Updated 2 months ago
- Extracts a latent knowledge graph from text and indexes/queries it in Elasticsearch or Solr ☆21 · Updated 3 years ago
- A schemaless graph database based on RocksDB ☆46 · Updated 2 years ago
- Inverted file indexing and retrieval optimized for short texts. Supports auto-suggest and query segment classification. ☆34 · Updated 2 years ago
- Next generation OCR engine based on LSTMs. ☆52 · Updated 7 years ago
- BK-tree with Damerau-Levenshtein distance and Trie with Levenshtein distance ☆19 · Updated 8 years ago
- A phonetic matching library. Includes text utilities to do string comparisons on phonemes (the sound of the string), as opposed to charac… ☆162 · Updated 2 years ago
- A Python library to detect and extract listing data from HTML pages. ☆108 · Updated 8 years ago
- OCR using Tesseract, ImageMagick, EmguCV, an advanced query language and a fluent query interface for C# ☆75 · Updated 2 years ago
- ReconNER: debug annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality of your data. ☆35 · Updated 5 years ago
- 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec ☆60 · Updated 4 years ago
- OCRonet is an optical character recognition (OCR) and document analysis system based on Convolutional Neural Networks (LeNet-5) and OCRopus. ☆21 · Updated 6 years ago
- An efficient data structure for fast string similarity searches ☆22 · Updated 4 years ago
- Generator of rule-based lemmatizers (based on examples) for several European languages. ☆29 · Updated 4 years ago
- Yet Another (natural language) Parser ☆43 · Updated 6 years ago
- A utility to randomize and split really huge (100+ GB) text files ☆21 · Updated 8 years ago
- Performance evaluation of nearest neighbor search using Vespa, Elasticsearch and Open Distro for Elasticsearch K-NN ☆117 · Updated 4 years ago
- Neural Elastic Inference and Search ☆19 · Updated 5 years ago
- 🚀 GUI for training spaCy models ☆55 · Updated 4 years ago
- Lexical database of any language ☆184 · Updated 3 years ago
- Search for similar short strings ☆53 · Updated 5 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 · Updated last year
- Smallest full text search engine (Lucene replacement) built from scratch using inverted Roaring bitmap index, highly compact storage, ope… ☆120 · Updated 5 years ago
- Vector Plugin for Solr: calculate dot product / cosine similarity on documents ☆20 · Updated 4 years ago
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a… ☆20 · Updated last year