Simplifies use of the Dedupe library via Pandas
☆136 · Mar 30, 2023 · Updated 3 years ago
Alternatives and similar repositories for pandas-dedupe
Users that are interested in pandas-dedupe are comparing it to the libraries listed below.
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,485 · Jul 29, 2025 · Updated 11 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,055 · Feb 21, 2024 · Updated 2 years ago
- Examples for using the dedupe library ☆417 · Aug 10, 2024 · Updated last year
- Flow and transmission cost allocation in power systems ☆17 · Jul 19, 2023 · Updated 2 years ago
- Record Linkage ToolKit (Find and link entities) ☆112 · Aug 14, 2023 · Updated 2 years ago
- The http://analyticsdojo.com open source codebase and curriculum. Learn to data science today. ☆38 · Dec 13, 2016 · Updated 9 years ago
- A list of free data matching and record linkage software. ☆406 · Feb 21, 2024 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 3 years ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 · Aug 9, 2022 · Updated 3 years ago
- Introduction to the new generation of python dataviz tools ☆20 · Feb 7, 2021 · Updated 5 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆127 · Feb 22, 2024 · Updated 2 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,225 · Jun 25, 2026 · Updated last week
- ☆27 · Sep 9, 2021 · Updated 4 years ago
- Fuzzy string matching, grouping, and evaluation. ☆798 · Jul 10, 2025 · Updated 11 months ago
- A collection of python utility functions ☆11 · May 8, 2026 · Updated last month
- R Evolved Generalized Software for Sampling Estimates and Errors in Surveys ☆16 · Oct 3, 2025 · Updated 9 months ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 · Nov 18, 2022 · Updated 3 years ago
- Super Fast String Matching in Python ☆372 · Jun 22, 2026 · Updated last week
- Dashboards showing intrinsic meta data for the OMOP-CDM databases in the EHDEN data network ☆14 · Feb 12, 2026 · Updated 4 months ago
- Pandas in black and white: a collection of opinionated pandas flashcards ☆14 · Feb 15, 2019 · Updated 7 years ago
- Edit CSV files using a table editor ☆23 · Jul 9, 2021 · Updated 4 years ago
- Streamlit component for the Yellowbrick visualization and model diagnostics library ☆12 · Jul 8, 2021 · Updated 4 years ago
- VSCode Extension integrating Editor with IPython console. ☆19 · Mar 2, 2026 · Updated 4 months ago
- PyPSA MCP: PyPSA Energy Modeling for LLMs ☆55 · Mar 20, 2026 · Updated 3 months ago
- Extract city and country mentions from Text like GeoText without regex, but FlashText, a Aho-Corasick implementation. ☆63 · Jun 23, 2026 · Updated last week
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f… ☆19 · Feb 26, 2018 · Updated 8 years ago
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆84 · Nov 29, 2021 · Updated 4 years ago
- 🔎 Finds fuzzy matches between CSV files ☆190 · Mar 26, 2025 · Updated last year
- A simple command line interface to the datamade/dedupe library. ☆43 · Dec 26, 2022 · Updated 3 years ago
- ☆11 · Apr 2, 2021 · Updated 5 years ago
- Optimization of flexibility options for transmission grids based on PyPSA ☆46 · Jun 11, 2026 · Updated 3 weeks ago
- Open modelling of European power systems in Python: a proof-of-concept ☆41 · Jan 13, 2023 · Updated 3 years ago
- Snakemake with pytest example ☆11 · Jul 18, 2020 · Updated 5 years ago
- AutoGen data analysis ☆33 · Feb 19, 2024 · Updated 2 years ago
- Learning rust by playing with a HL7 parser. Use for real at your own risk! ☆20 · Jul 28, 2025 · Updated 11 months ago
- Python bindings to libpostal for fast international address parsing/normalization ☆880 · Nov 1, 2025 · Updated 8 months ago
- Python script for matching a list of messy addresses against a gazetteer using dedupe. ☆64 · Mar 31, 2020 · Updated 6 years ago
- A curated list of ML awesome frameworks & libraries for text data ☆17 · Mar 14, 2023 · Updated 3 years ago
- Optimal Wind+Hydrogen+Other+Battery+Solar (WHOBS) electricity systems for European countries ☆52 · Oct 11, 2018 · Updated 7 years ago