dedupeio / fuzzycategory
Fuzzy Categorical Distances
☆14 · Updated 4 years ago
Alternatives and similar repositories for fuzzycategory:
Users who are interested in fuzzycategory are comparing it to the libraries listed below.
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆81 · Updated 3 years ago
- A maximum-strength name parser for record linkage. ☆36 · Updated 5 months ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 · Updated 8 years ago
- Scalable String Similarity Joins in Python ☆38 · Updated 6 months ago
- Hidden alignment conditional random field for classifying string pairs. ☆25 · Updated 3 months ago
- Python binding for gumbo-parser using Cython ☆14 · Updated 8 years ago
- Streaming newline delimited JSON I/O. ☆12 · Updated last year
- Python wrapper for a C++ Double Metaphone ☆15 · Updated last year
- A tool to read CSV files with CSVW metadata and transform them into other formats. ☆32 · Updated 5 years ago
- Modularly extensible semantic metadata validator ☆83 · Updated 9 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated 2 years ago
- An index data structure for approximate string search. ☆23 · Updated 5 years ago
- a set of services that provide NLP facilities ☆25 · Updated 4 years ago
- (Archived) A Python library for record linkage and deduplication. ☆19 · Updated 10 months ago
- View a list of JSON-serializable dictionaries or a 2-D array, in HandsOnTable, in Jupyter Notebook. ☆13 · Updated 6 years ago
- Utilities for working with data. ☆20 · Updated 9 years ago
- A Cython implementation of the affine gap string distance ☆58 · Updated last year
- Algorithms for "schema matching" ☆25 · Updated 8 years ago
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 · Updated 9 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 · Updated 8 years ago
- Traptor -- A distributed Twitter feed ☆26 · Updated 2 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 · Updated 7 years ago
- CSV inspection ☆10 · Updated 2 years ago
- A Singer.io Target for the Stitch Import API ☆26 · Updated this week
- An advanced yet simple system to run your background tasks and workflows ☆20 · Updated 7 years ago
- Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. ☆62 · Updated last year
- Inspect a URL and estimate if it contains a news story ☆39 · Updated last month
- Postgresql utilities for ETL and data analysis ☆24 · Updated 7 years ago
- ☆16 · Updated 4 months ago