gpoulter / pydedupe
(Archived) A Python library for record linkage and deduplication.
☆19 Updated 8 months ago
Related projects
Alternatives and complementary repositories for pydedupe
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 Updated 3 years ago
- Versioned domain model. Python library for revisioning/versioning of databases. ☆44 Updated 3 years ago
- Generate Elasticsearch indexes based on Table Schema descriptors. ☆10 Updated 3 years ago
- Streaming newline delimited JSON I/O. ☆12 Updated last year
- Python wrapper for a C++ Double Metaphone ☆15 Updated last year
- Utility library to turn country names into ISO two-letter codes ☆66 Updated this week
- A maximum-strength name parser for record linkage. ☆34 Updated 3 months ago
- ☆13 Updated 5 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated last year
- Framework for processing data packages in pipelines of modular components. ☆119 Updated last year
- CSV on the web ☆37 Updated last month
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 Updated 9 years ago
- A browser user interface for manual labeling of record pairs. ☆41 Updated last year
- A scrapy extension to store requests and responses information in storage service ☆26 Updated 2 years ago
- Extract, parse and populate templates from strings ☆27 Updated 5 years ago
- workflow support for reproducible deduplication and merging ☆16 Updated last year
- Python language parser for a tabular format for structured metadata. http://metatab.org ☆17 Updated last year
- Extends zip() and itertools.zip_longest() to generate named tuples. ☆23 Updated 5 years ago
- Dexter document monitor for MMA ☆17 Updated 6 months ago
- A Python library for defining rule-based overrides on messy data ☆12 Updated this week
- Enhance your feature engineering workflow with Kodiak ☆20 Updated last year
- An easy interface for documenting data packages ☆19 Updated 6 years ago
- International Address formatter which considers the standard formatting rules of the country ☆26 Updated 3 years ago
- Scalable String Similarity Joins in Python ☆39 Updated 4 months ago
- A Python package that simplifies the use of secrets in a Jupyter notebook ☆21 Updated 3 years ago
- Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. ☆61 Updated last year
- ☆12 Updated 7 years ago
- csvcat ☆22 Updated 8 years ago