lingz / fast_fuzzy_search
Fast Fuzzy Phonetic Search algorithm in Python
☆14 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for fast_fuzzy_search
- Python Phonetic Tools and Distance Metrics ☆12 · Updated 6 years ago
- A thin wrapper around the DBPedia Spotlight REST API ☆58 · Updated 6 months ago
- Train NLTK punkt tokenizers ☆50 · Updated 14 years ago
- http://www.dialog-21.ru/evaluation/2016/letter/ ☆56 · Updated 7 years ago
- A Python implementation of the Metaphone and Double Metaphone algorithms ☆80 · Updated 8 months ago
- Python interface to http://opencorpora.org/ ☆45 · Updated 4 years ago
- Samsung Natural Language Processing Pipeline (basically for Russian language): morphology, dependency parser and much more ☆59 · Updated 4 years ago
- Python bindings for libwapiti ☆66 · Updated 4 years ago
- Python wrapper for aspell (C extension and python version) ☆81 · Updated last year
- A simple and fast rule-based sentence segmentation. Tested on OpenCorpora and SynTagRus datasets. ☆53 · Updated 6 years ago
- DAFSA-based dictionary-like read-only objects for Python. Based on `dawgdic` C++ library. ☆300 · Updated 5 months ago
- Russian mass media stemmed texts corpus / Корпус лемматизированных (морфологически нормализованных) текстов российских СМИ ☆88 · Updated 7 years ago
- AdaGram (adaptive skip-gram) for Python ☆74 · Updated 7 years ago
- Repository for ru-syntax command line tool. ☆16 · Updated 2 years ago
- Custom Russian tokenizer for spaCy ☆42 · Updated 5 years ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ☆170 · Updated 2 years ago
- Hunspell extension for spaCy 2.0. ☆94 · Updated 3 months ago
- pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. ☆243 · Updated 6 months ago
- Morphological Analyzer for Russian 💬 ☆40 · Updated 3 years ago
- System for automatic pronominal resolution for Russian ☆16 · Updated 4 years ago
- Code and data used in named entity transliteration experiments ☆57 · Updated 6 years ago
- Clinical spelling correction with word and character n-gram embeddings. ☆74 · Updated 2 years ago
- Watasense: an Unsupervised WSD System for Under-Resourced Languages. ☆11 · Updated 2 years ago
- Pre-trained models for tokenization, sentence segmentation and so on ☆15 · Updated 7 years ago
- Open Source framework for developing Dialog Agents ☆20 · Updated 6 years ago
- Python wrapper around SVDLIBC, a fast library for sparse Singular Value Decomposition ☆55 · Updated 11 years ago
- Tokenize English sentences using neural networks. ☆64 · Updated 7 years ago