jamesturk / jellyfish
🪼 a python library for doing approximate and phonetic matching of strings.
★2,130, updated 3 weeks ago
Alternatives and similar repositories for jellyfish:
Users who are interested in jellyfish are comparing it to the libraries listed below.
- Fuzzy String Matching in Python (★9,257, updated 2 years ago)
- Fuzzy String Matching in Python (★3,194, updated 2 months ago)
- python parser for human readable dates (★2,648, updated last month)
- Rapid fuzzy string matching in Python using various string metrics (★3,064, updated last week)
- extract text from any document. no muss. no fuss. (★4,112, updated 5 months ago)
- A simple Python module for parsing human names into their individual components (★673, updated 11 months ago)
- Fixes mojibake and other glitches in Unicode text, after the fact. (★3,903, updated 6 months ago)
- 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage. (★3,461, updated 2 weeks ago)
- The Levenshtein Python C extension module contains functions for fast computation of Levenshtein distance and string similarity (★1,274, updated 3 years ago)
- a python library for parsing unstructured western names into name components. (★606, updated 6 months ago)
- A toolkit for making domain-specific probabilistic parsers (★800, updated 7 months ago)
- Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. (★1,580, updated 3 weeks ago)
- Python implementation of TextRank algorithms ("textgraphs") for phrase extraction (★2,174, updated 9 months ago)
- Port of Google's language-detection library to Python. (★1,791, updated 2 months ago)
- Utils for streaming large files (S3, HDFS, gzip, bz2...) (★3,304, updated last month)
- A powerful and modular toolkit for record linkage and duplicate detection in Python (★1,001, updated last year)
- Python character encoding detector (★2,248, updated 3 months ago)
- Computing with Python functions. (★4,050, updated this week)
- Useful extensions to the standard Python datetime features (★2,451, updated last month)
- NLP, before and after spaCy (★2,225, updated last year)
- Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python. (★2,538, updated 8 months ago)
- Python port of SymSpell: 1 million times faster spelling correction & fuzzy search through Symmetric Delete spelling correction algorithm… (★824, updated last week)
- Convert HTML to Markdown-formatted text. (★1,978, updated 3 weeks ago)
- Parse strings using a specification based on the Python format() syntax. (★1,744, updated 4 months ago)
- ☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️ (★1,994, updated 3 months ago)
- spellchecking library for python (★609, updated 10 months ago)
- pyxDamerauLevenshtein implements the Damerau-Levenshtein (DL) edit distance algorithm for Python in Cython for high performance. (★247, updated last year)
- Find dates inside text using Python and get back datetime objects (★652, updated 11 months ago)
- Python dictionaries with advanced dot notation access (★2,722, updated this week)
- ASCII transliterations of Unicode text - GitHub mirror (★560, updated last week)