Simplifies use of the Dedupe library via Pandas
☆135 · Mar 30, 2023 · Updated 3 years ago
Alternatives and similar repositories for pandas-dedupe
Users that are interested in pandas-dedupe are comparing it to the libraries listed below.
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,463 · Jul 29, 2025 · Updated 9 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,049 · Feb 21, 2024 · Updated 2 years ago
- (Archived) A Python library for record linkage and deduplication. ☆19 · Mar 19, 2024 · Updated 2 years ago
- Flow and transmission cost allocation in power systems ☆17 · Jul 19, 2023 · Updated 2 years ago
- The http://analyticsdojo.com open source codebase and curriculum. Learn to data science today. ☆38 · Dec 13, 2016 · Updated 9 years ago
- A list of free data matching and record linkage software. ☆405 · Feb 21, 2024 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆48 · Jun 23, 2023 · Updated 2 years ago
- PyPSATopo is a tool that allows generating the topographical representation of any arbitrary PyPSA-based network ☆25 · Dec 19, 2025 · Updated 5 months ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 · Aug 9, 2022 · Updated 3 years ago
- Introduction to the new generation of python dataviz tools ☆20 · Feb 7, 2021 · Updated 5 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,158 · Updated this week
- ☆27 · Sep 9, 2021 · Updated 4 years ago
- Fuzzy string matching, grouping, and evaluation. ☆796 · Jul 10, 2025 · Updated 10 months ago
- Julia enum made nicer ☆10 · May 28, 2020 · Updated 5 years ago
- PyPSA-DE: High resolution, sector-coupled model of the German Energy System ☆46 · May 13, 2026 · Updated last week
- Scripts to download the U.S. Department of Justice's National Caseload Data and load it into Amazon Athena for querying ☆15 · May 22, 2023 · Updated 3 years ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 · Nov 18, 2022 · Updated 3 years ago
- ☆12 · Jun 3, 2023 · Updated 2 years ago
- A patient matching test harness to support PCOR ☆16 · Feb 28, 2017 · Updated 9 years ago
- Work for Mastering Large Datasets with Python ☆20 · Dec 8, 2022 · Updated 3 years ago
- Geospatial Land Availability for Energy Systems ☆62 · Apr 30, 2026 · Updated 3 weeks ago
- Super Fast String Matching in Python ☆370 · Mar 14, 2025 · Updated last year
- Python/Django application comparing the Amadeus Self-Service Flight Offers Search with the Flight Choice Prediction APIs ☆20 · Sep 24, 2024 · Updated last year
- Dashboards showing intrinsic meta data for the OMOP-CDM databases in the EHDEN data network ☆14 · Feb 12, 2026 · Updated 3 months ago
- Showcasing various NLP Downstream tasks Training with pre-trained Language models using Pytorch Lightning ☆13 · Aug 7, 2022 · Updated 3 years ago
- Pandas in black and white: a collection of opinionated pandas flashcards ☆14 · Feb 15, 2019 · Updated 7 years ago
- Edit CSV files using a table editor ☆23 · Jul 9, 2021 · Updated 4 years ago
- Streamlit component for the Yellowbrick visualization and model diagnostics library ☆12 · Jul 8, 2021 · Updated 4 years ago
- Extract city and country mentions from Text like GeoText without regex, but FlashText, a Aho-Corasick implementation. ☆63 · May 11, 2026 · Updated last week
- PyPSA MCP: PyPSA Energy Modeling for LLMs ☆49 · Mar 20, 2026 · Updated 2 months ago
- HHS HCC SQL Risk Score Model ☆13 · May 4, 2026 · Updated 2 weeks ago
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f… ☆19 · Feb 26, 2018 · Updated 8 years ago
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆84 · Nov 29, 2021 · Updated 4 years ago
- Supplementary code for "Name2Vec: Personal Names Embeddings" presented at The Canadian Conference on AI 2019. ☆18 · Jun 25, 2020 · Updated 5 years ago
- Tool to help with governance voting on Cardano ☆13 · Updated this week
- A simple command line interface to the datamade/dedupe library. ☆43 · Dec 26, 2022 · Updated 3 years ago
- micro-library to produce a couple of basic, attractive, printable plots with matplotlib ☆11 · Mar 4, 2018 · Updated 8 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi ☆81 · May 28, 2021 · Updated 4 years ago
- This library was created in order to evaluate the effectiveness of any kind of algorithm used in IR systems and analyze how well they per… ☆15 · Apr 15, 2020 · Updated 6 years ago