richmilne / JaroWinkler
Original, standard and customisable versions of the Jaro-Winkler functions.
☆31Updated 2 years ago
Related projects
Alternatives and complementary repositories for JaroWinkler
- An index data structure for approximate string search.☆23Updated 5 years ago
- Scalable String Similarity Joins in Python☆39Updated 4 months ago
- A Python implementation of the Metaphone and Double Metaphone algorithms☆80Updated 8 months ago
- Multi-Language Identification☆28Updated 3 months ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing☆67Updated last week
- A disk-based key/value store in Python with no dependencies.☆21Updated 9 years ago
- Python wrapper for a C++ Double Metaphone☆15Updated last year
- Set-oriented Operations in Pandas☆24Updated 4 years ago
- Ensemble topic modeling with matrix factorization☆23Updated 6 years ago
- The most basic Text::Unidecode port (licensed under Artistic License or GPL or GPLv2+ - choose whatever you want)☆64Updated last year
- 🧬 A VS Code extension for annotating data with Prodigy☆30Updated 2 years ago
- Find which links on a web page are pagination links☆29Updated 7 years ago
- This is an Object Oriented implementation of a Trie in python. The class contains setter and getter methods, and implements several usefu…☆14Updated 6 years ago
- A compound word splitter for Python☆48Updated 3 years ago
- Guess gender from first name in Python 2 and 3☆129Updated 2 years ago
- 💥 Cython hash tables that assume keys are pre-hashed☆82Updated last year
- A simple fuzzy matching set for Python strings☆223Updated 2 months ago
- Price and currency parsing utility☆26Updated last year
- A maximum-strength name parser for record linkage.☆32Updated 3 months ago
- Hidden alignment conditional random field for classifying string pairs.☆25Updated last month
- Python package for deduplication using flexible text matching and cleaning in pandas DataFrames☆25Updated 3 years ago
- Efficient string matching with regular expressions☆138Updated this week
- Annotation management for Prodigy that supports multiple users working on many projects☆15Updated 5 years ago
- Python bindings for Google's FarmHash☆37Updated 2 months ago
- Language detection extension for spaCy 2.0+☆111Updated 5 years ago