Simplifies use of the Dedupe library via Pandas
☆135Mar 30, 2023Updated 3 years ago
Alternatives and similar repositories for pandas-dedupe
Users that are interested in pandas-dedupe are comparing it to the libraries listed below.
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.☆4,452Jul 29, 2025Updated 8 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python☆1,049Feb 21, 2024Updated 2 years ago
- Examples for using the dedupe library☆418Aug 10, 2024Updated last year
- (Archived) A Python library for record linkage and deduplication.☆19Mar 19, 2024Updated 2 years ago
- Record Linkage ToolKit (Find and link entities)☆111Aug 14, 2023Updated 2 years ago
- A list of free data matching and record linkage software.☆403Feb 21, 2024Updated 2 years ago
- A browser user interface for manual labeling of record pairs.☆48Jun 23, 2023Updated 2 years ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4☆286Aug 9, 2022Updated 3 years ago
- Command line tool for deduplicating CSV files☆434Mar 31, 2020Updated 6 years ago
- Fuzzy string matching, grouping, and evaluation.☆795Jul 10, 2025Updated 9 months ago
- A collection of python utility functions☆11Mar 30, 2026Updated last week
- Julia enum made nicer☆10May 28, 2020Updated 5 years ago
- PyPSA-DE: High resolution, sector-coupled model of the German Energy System☆42Updated this week
- R Evolved Generalized Software for Sampling Estimates and Errors in Surveys☆15Oct 3, 2025Updated 6 months ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut…☆161Nov 18, 2022Updated 3 years ago
- ☆12Jun 3, 2023Updated 2 years ago
- A patient matching test harness to support PCOR☆16Feb 28, 2017Updated 9 years ago
- Connecting Conference Organizers and Speakers since 201x☆11Sep 16, 2016Updated 9 years ago
- Super Fast String Matching in Python☆370Mar 14, 2025Updated last year
- Dashboards showing intrinsic meta data for the OMOP-CDM databases in the EHDEN data network☆14Feb 12, 2026Updated 2 months ago
- Pandas in black and white: a collection of opinionated pandas flashcards☆14Feb 15, 2019Updated 7 years ago
- HHS HCC SQL Risk Score Model☆13Feb 13, 2026Updated last month
- Example Multi-Cycle, Multi-Touch Revenue and Cost Attribution Model☆34Feb 16, 2024Updated 2 years ago
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f…☆19Feb 26, 2018Updated 8 years ago
- Dedupe/batch geocode addresses and venues around the world with libpostal☆84Nov 29, 2021Updated 4 years ago
- Locality-sensitive hashing in PySpark.☆27Mar 11, 2015Updated 11 years ago
- A simple command line interface to the datamade/dedupe library.☆43Dec 26, 2022Updated 3 years ago
- Optimization of flexibility options for transmission grids based on PyPSA☆43Updated this week
- micro-library to produce a couple of basic, attractive, printable plots with matplotlib☆11Mar 4, 2018Updated 8 years ago
- Open modelling of European power systems in Python: a proof-of-concept☆41Jan 13, 2023Updated 3 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi☆81May 28, 2021Updated 4 years ago
- Library for unit extraction - fork of quantulum for python3☆144Mar 30, 2026Updated last week
- AutoGen data analysis☆33Feb 19, 2024Updated 2 years ago
- This library was created in order to evaluate the effectiveness of any kind of algorithm used in IR systems and analyze how well they per…☆15Apr 15, 2020Updated 5 years ago
- A pytorch implementation of "SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data"☆29Jul 23, 2019Updated 6 years ago
- Python bindings to libpostal for fast international address parsing/normalization☆870Nov 1, 2025Updated 5 months ago
- Python script for matching a list of messy addresses against a gazetteer using dedupe.☆64Mar 31, 2020Updated 6 years ago
- A curated list of ML awesome frameworks & libraries for text data☆17Mar 14, 2023Updated 3 years ago
- Combination of the RapidFuzz library with Spacy PhraseMatcher☆11Sep 29, 2021Updated 4 years ago