jamesturk / jellyfish
🪼 a python library for doing approximate and phonetic matching of strings.
★ 2,070 · Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for jellyfish
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ★ 1,264 · Updated 3 years ago
- extract text from any document. no muss. no fuss. ★ 3,912 · Updated this week
- NLP, before and after spaCy ★ 2,217 · Updated last year
- Port of Google's language-detection library to Python. ★ 1,728 · Updated 9 months ago
- Fuzzy String Matching in Python ★ 9,230 · Updated last year
- Find dates inside text using Python and get back datetime objects ★ 635 · Updated 6 months ago
- Fixes mojibake and other glitches in Unicode text, after the fact. ★ 3,818 · Updated 3 weeks ago
- python parser for human readable dates ★ 2,561 · Updated last week
- spellchecking library for python ★ 601 · Updated 5 months ago
- Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. ★ 3,396 · Updated 2 months ago
- Utils for streaming large files (S3, HDFS, gzip, bz2...) ★ 3,215 · Updated 3 weeks ago
- Fuzzy String Matching in Python ★ 2,901 · Updated 8 months ago
- Correctly generate plurals, ordinals, indefinite articles; convert numbers to words ★ 976 · Updated 3 weeks ago
- Rapid fuzzy string matching in Python using various string metrics ★ 2,735 · Updated this week
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ★ 1,511 · Updated 7 months ago
- Computing with Python functions. ★ 3,881 · Updated last week
- Python bindings to libpostal for fast international address parsing/normalization ★ 767 · Updated 4 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ★ 968 · Updated 9 months ago
- Contextually-keyed word vectors ★ 1,626 · Updated 8 months ago
- Python datetimes made easy ★ 6,256 · Updated 3 weeks ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ★ 2,152 · Updated 4 months ago
- A collection of common regular expressions bundled with an easy to use interface. ★ 1,570 · Updated last year
- Parse human-readable date/time strings ★ 695 · Updated last month
- Datetimes for Humans™ ★ 3,409 · Updated 4 months ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ★ 4,151 · Updated this week
- a python library for parsing unstructured United States address strings into address components ★ 1,534 · Updated last month
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… ★ 802 · Updated this week
- Multilingual text (NLP) processing toolkit ★ 2,317 · Updated last year
- serialize all of Python ★ 2,277 · Updated 3 weeks ago
- Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ★ 1,918 · Updated 2 weeks ago