Simplifies use of the Dedupe library via Pandas
☆135 · Mar 30, 2023 · Updated 3 years ago
Alternatives and similar repositories for pandas-dedupe
Users that are interested in pandas-dedupe are comparing it to the libraries listed below.
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,048 · Feb 21, 2024 · Updated 2 years ago
- Examples for using the dedupe library ☆418 · Aug 10, 2024 · Updated last year
- Flow and transmission cost allocation in power systems ☆17 · Jul 19, 2023 · Updated 2 years ago
- A list of free data matching and record linkage software. ☆403 · Feb 21, 2024 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 2 years ago
- PyPSATopo is a tool that allows generating the topographical representation of any arbitrary PyPSA-based network ☆25 · Dec 19, 2025 · Updated 4 months ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 · Aug 9, 2022 · Updated 3 years ago
- Introduction to the new generation of python dataviz tools ☆20 · Feb 7, 2021 · Updated 5 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆127 · Feb 22, 2024 · Updated 2 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,111 · Updated this week
- Command line tool for deduplicating CSV files ☆434 · Mar 31, 2020 · Updated 6 years ago
- Fuzzy string matching, grouping, and evaluation. ☆794 · Jul 10, 2025 · Updated 9 months ago
- PyPSA-DE: High resolution, sector-coupled model of the German Energy System ☆44 · Updated this week
- A content-filtering bypass system developed specifically to allow access to trans-related resources on public networks (libraries, school… ☆27 · Nov 15, 2014 · Updated 11 years ago
- R Evolved Generalized Software for Sampling Estimates and Errors in Surveys ☆16 · Oct 3, 2025 · Updated 7 months ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 · Nov 18, 2022 · Updated 3 years ago
- ☆12 · Jun 3, 2023 · Updated 2 years ago
- Connecting Conference Organizers and Speakers since 201x ☆11 · Sep 16, 2016 · Updated 9 years ago
- Work for Mastering Large Datasets with Python ☆20 · Dec 8, 2022 · Updated 3 years ago
- Super Fast String Matching in Python ☆369 · Mar 14, 2025 · Updated last year
- Python/Django application comparing the Amadeus Self-Service Flight Offers Search with the Flight Choice Prediction APIs ☆20 · Sep 24, 2024 · Updated last year
- Pandas in black and white: a collection of opinionated pandas flashcards ☆14 · Feb 15, 2019 · Updated 7 years ago
- Edit CSV files using a table editor ☆23 · Jul 9, 2021 · Updated 4 years ago
- Streamlit component for the Yellowbrick visualization and model diagnostics library ☆12 · Jul 8, 2021 · Updated 4 years ago
- HHS HCC SQL Risk Score Model ☆13 · Feb 13, 2026 · Updated 2 months ago
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f… ☆19 · Feb 26, 2018 · Updated 8 years ago
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆84 · Nov 29, 2021 · Updated 4 years ago
- Locality-sensitive hashing in PySpark. ☆27 · Mar 11, 2015 · Updated 11 years ago
- 🔎 Finds fuzzy matches between CSV files ☆191 · Mar 26, 2025 · Updated last year
- A simple command line interface to the datamade/dedupe library. ☆43 · Dec 26, 2022 · Updated 3 years ago
- micro-library to produce a couple of basic, attractive, printable plots with matplotlib ☆11 · Mar 4, 2018 · Updated 8 years ago
- Open modelling of European power systems in Python: a proof-of-concept ☆41 · Jan 13, 2023 · Updated 3 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi ☆81 · May 28, 2021 · Updated 4 years ago
- Snakemake with pytest example ☆11 · Jul 18, 2020 · Updated 5 years ago
- 🍯 Sweet simple static site generator with Dune vibes ☆14 · Feb 17, 2026 · Updated 2 months ago
- Python bindings to libpostal for fast international address parsing/normalization ☆874 · Nov 1, 2025 · Updated 6 months ago
- NextGIS build of QGIS ☆27 · Oct 20, 2025 · Updated 6 months ago
- Optimal Wind+Hydrogen+Other+Battery+Solar (WHOBS) electricity systems for European countries ☆52 · Oct 11, 2018 · Updated 7 years ago
- fast Rust-based SVMlight parser ☆11 · Feb 19, 2021 · Updated 5 years ago