Lyonk71 / pandas-dedupe
Simplifies use of the Dedupe library via Pandas
☆136Mar 30, 2023Updated 2 years ago
Alternatives and similar repositories for pandas-dedupe
Users who are interested in pandas-dedupe are comparing it to the libraries listed below.
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.☆4,436Jul 29, 2025Updated 6 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python☆1,045Feb 21, 2024Updated last year
- Examples for using the dedupe library☆416Aug 10, 2024Updated last year
- A collection of python utility functions☆11Updated this week
- A browser user interface for manual labeling of record pairs.☆48Jun 23, 2023Updated 2 years ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4☆286Aug 9, 2022Updated 3 years ago
- Pandas in black and white: a collection of opinionated pandas flashcards☆14Feb 15, 2019Updated 7 years ago
- Record Linkage ToolKit (Find and link entities)☆111Aug 14, 2023Updated 2 years ago
- A list of free data matching and record linkage software.☆401Feb 21, 2024Updated last year
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f…☆19Feb 26, 2018Updated 7 years ago
- (Archived) A Python library for record linkage and deduplication.☆19Mar 19, 2024Updated last year
- Python script for processing DrugBank XML to MySQL-ready CSV files☆19Mar 6, 2017Updated 8 years ago
- Blue House national petition data archive☆15Aug 29, 2020Updated 5 years ago
- Fuzzy string matching, grouping, and evaluation.☆788Jul 10, 2025Updated 7 months ago
- Introduction to the new generation of python dataviz tools☆20Feb 7, 2021Updated 5 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs☆20Sep 1, 2021Updated 4 years ago
- Complementary code for blog posts☆24Jan 11, 2025Updated last year
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut…☆161Nov 18, 2022Updated 3 years ago
- Tutorial on Pandas at PyData Amsterdam, 9am to midday, Friday 7 April 2017☆19Apr 7, 2017Updated 8 years ago
- PyPSA-DE: High resolution, sector-coupled model of the German Energy System☆38Updated this week
- Library of automation tools for EDA and modeling☆27Feb 7, 2021Updated 5 years ago
- 🔎 Finds fuzzy matches between CSV files☆191Mar 26, 2025Updated 10 months ago
- Repository of web and code editor friendly Observable Data Tools 🛠️ and Notebooks 📚 in .js, .nb.json, .ojs, .omd, .html and .qmd docum…☆32Aug 31, 2023Updated 2 years ago
- Extract city and country mentions from text like GeoText, using FlashText (an Aho-Corasick implementation) instead of regex.☆63Feb 2, 2026Updated last week
- Company Name Processor written in Python☆350Jan 16, 2026Updated 3 weeks ago
- variations of the record linkage model of Steorts et al. AISTATS 2014's "SMERED: A Bayesian Approach to Graphical Record Linkage and De-d…☆26Mar 13, 2017Updated 8 years ago
- Super Fast String Matching in Python☆371Mar 14, 2025Updated 11 months ago
- Gibsonify — Collect nutritional data using Gibson's method!☆11Oct 28, 2023Updated 2 years ago
- 🚀 Implementation of easy-to-use 3D parallelism based on Huggingface Transformers & Microsoft DeepSpeed☆31Feb 5, 2022Updated 4 years ago
- Scalable identity resolution, entity resolution, data mastering and deduplication using ML☆1,146Updated this week
- Skinfer is a tool for inferring and merging JSON schemas☆141Apr 24, 2024Updated last year
- 🏗️ Create APIs from CSV files within seconds, using fastapi☆80May 28, 2021Updated 4 years ago
- A python implementation of the sinaplot using matplotlib and seaborn☆11Jun 5, 2018Updated 7 years ago
- PyPSA MCP: PyPSA Energy Modeling for LLMs☆45Apr 22, 2025Updated 9 months ago
- The best Python package for comparing two dataframes☆11Dec 29, 2021Updated 4 years ago
- The MCP Strava Server facilitates seamless integration between Strava APIs and Claude for Desktop.☆11Feb 3, 2026Updated last week
- Ziplist Recipe Plugin for Wordpress☆20Aug 26, 2025Updated 5 months ago
- Courseware for DataScienceWithPython☆14Sep 5, 2025Updated 5 months ago