dedupeio / fuzzycategory
Fuzzy Categorical Distances
☆14 Updated 4 years ago
Alternatives and similar repositories for fuzzycategory:
Users who are interested in fuzzycategory are comparing it to the libraries listed below.
- A maximum-strength name parser for record linkage. ☆36 Updated last week
- Scalable String Similarity Joins in Python ☆38 Updated 7 months ago
- Hidden alignment conditional random field for classifying string pairs. ☆24 Updated 4 months ago
- A browser user interface for manual labeling of record pairs. ☆44 Updated last year
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆82 Updated 3 years ago
- Algorithms for "schema matching" ☆26 Updated 8 years ago
- Python binding for gumbo-parser using Cython ☆14 Updated 8 years ago
- ☆30 Updated 2 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated 2 years ago
- A Cython implementation of the affine gap string distance ☆57 Updated 2 years ago
- ☆13 Updated 5 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 Updated 8 years ago
- An index data structure for approximate string search. ☆23 Updated 5 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 2 months ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆147 Updated last month
- mltk - Moz Language Tool Kit ☆12 Updated 9 years ago
- Demo of Single-cell IPython webapps ☆28 Updated 11 years ago
- Python port for IWNLP.Lemmatizer ☆17 Updated last year
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆67 Updated 2 years ago
- Traptor -- A distributed Twitter feed ☆26 Updated 2 years ago
- Python package for deduplication/entity resolution using active learning ☆76 Updated 5 months ago
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 Updated 9 years ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆61 Updated 4 months ago
- Navigating around a grid of cells like XPath for spreadsheets; supports Python 3.5+ ☆47 Updated 2 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆117 Updated 11 months ago
- Provide partial dates and retain the date precision through processing ☆13 Updated 2 years ago
- Language detection using Spacy and Fasttext ☆55 Updated last year
- Postgresql utilities for ETL and data analysis ☆24 Updated 7 years ago
- A disk-based key/value store in Python with no dependencies. ☆21 Updated 9 years ago