taleinat / fuzzysearch
Find parts of long text or data, allowing for some changes/typos.
★312 · Updated 3 months ago
Related projects
Alternatives and complementary repositories for fuzzysearch
- ★165 · Updated 5 months ago
- Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle) ★435 · Updated 4 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ★149 · Updated last year
- pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. ★243 · Updated 6 months ago
- Text tokenization and sentence segmentation (segtok v2) ★203 · Updated 2 years ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ★67 · Updated 2 weeks ago
- Pure Python Aho-Corasick library. ★212 · Updated last year
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ★99 · Updated 3 weeks ago
- Pure Python Spell Checking http://pyspellchecker.readthedocs.io/en/latest/ ★713 · Updated 8 months ago
- Convert number words (e.g. twenty one) to numeric digits (21) ★169 · Updated last year
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ★801 · Updated this week
- English word segmentation, written in pure-Python, and based on a trillion-word corpus. ★365 · Updated last year
- A simple fuzzy matching set for Python strings ★223 · Updated 3 months ago
- A fast and memory-optimized string library for heavy-text manipulation in Python ★250 · Updated 4 years ago
- Lightning Fast Language Prediction ★165 · Updated 5 years ago
- Fast Levenshtein Distance Library for Python 3 ★81 · Updated 2 years ago
- Textpipe: clean and extract metadata from text ★299 · Updated 3 years ago
- Super Fast String Matching in Python ★364 · Updated 6 months ago
- Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic fe… ★170 · Updated 2 years ago
- Fast Autocomplete: When Elasticsearch suggestions are not fast and flexible enough ★271 · Updated last year
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ★102 · Updated 2 weeks ago
- ★450 · Updated 2 weeks ago
- Plac: Parsing the Command Line the Easy Way ★296 · Updated 3 months ago
- Find strings/words in text; convenience and C speed ★126 · Updated 2 years ago
- Contextual word checker for better suggestions (not actively maintained) ★409 · Updated last month
- Spellchecking library for Python ★601 · Updated 5 months ago
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ★281 · Updated 3 weeks ago
- Python wrapper for Stanford CoreNLP's SUTime ★153 · Updated last year
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ★144 · Updated this week