jamesturk / jellyfish
🪼 a python library for doing approximate and phonetic matching of strings.
☆2,187 · Updated last month
Alternatives and similar repositories for jellyfish
Users that are interested in jellyfish are comparing it to the libraries listed below
- Fixes mojibake and other glitches in Unicode text, after the fact. ☆4,011 · Updated last year
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity ☆1,277 · Updated 4 years ago
- python parser for human readable dates ☆2,778 · Updated this week
- Find dates inside text using Python and get back datetime objects ☆666 · Updated last year
- Python bindings to libpostal for fast international address parsing/normalization ☆866 · Updated 3 months ago
- A simple Python module for parsing human names into their individual components ☆701 · Updated last year
- NLP, before and after spaCy ☆2,232 · Updated 2 years ago
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. ☆3,513 · Updated 9 months ago
- A collection of common regular expressions bundled with an easy to use interface. ☆1,584 · Updated 2 years ago
- spellchecking library for python ☆618 · Updated 4 months ago
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. ☆1,641 · Updated 9 months ago
- Correctly generate plurals, ordinals, indefinite articles; convert numbers to words ☆1,060 · Updated 8 months ago
- Multilingual text (NLP) processing toolkit ☆2,362 · Updated 2 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,044 · Updated last year
- extract text from any document. no muss. no fuss. ☆4,428 · Updated last year
- a python library for parsing unstructured United States address strings into address components ☆1,614 · Updated 5 months ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,430 · Updated 6 months ago
- A toolkit for making domain-specific probabilistic parsers ☆804 · Updated last year
- Parse human-readable date/time strings ☆710 · Updated 3 months ago
- Utils for streaming large files (S3, HDFS, gzip, bz2...) ☆3,429 · Updated last week
- Port of Google's language-detection library to Python. ☆1,870 · Updated 11 months ago
- a python library for parsing unstructured western names into name components. ☆616 · Updated 8 months ago
- Computing with Python functions. ☆4,321 · Updated 3 weeks ago
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction ☆2,208 · Updated this week
- serialize all of Python ☆2,426 · Updated last week
- Rapid fuzzy string matching in Python using various string metrics ☆3,702 · Updated last week
- python humanize functions ☆1,697 · Updated 3 years ago
- 🦆 Contextually-keyed word vectors ☆1,670 · Updated 9 months ago
- Heuristic based boilerplate removal tool ☆811 · Updated 11 months ago
- Flatten JSON in Python ☆553 · Updated 2 years ago