Simplifies use of the Dedupe library via Pandas
☆137 · Mar 30, 2023 · Updated 2 years ago
Alternatives and similar repositories for pandas-dedupe
Users who are interested in pandas-dedupe are comparing it to the libraries listed below.
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. · ☆4,445 · Jul 29, 2025 · Updated 7 months ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python · ☆1,048 · Feb 21, 2024 · Updated 2 years ago
- Examples for using the dedupe library · ☆419 · Aug 10, 2024 · Updated last year
- (Archived) A Python library for record linkage and deduplication. · ☆19 · Mar 19, 2024 · Updated 2 years ago
- Record Linkage ToolKit (Find and link entities) · ☆111 · Aug 14, 2023 · Updated 2 years ago
- The http://analyticsdojo.com open source codebase and curriculum. Learn to data science today. · ☆38 · Dec 13, 2016 · Updated 9 years ago
- A list of free data matching and record linkage software. · ☆401 · Feb 21, 2024 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. · ☆48 · Jun 23, 2023 · Updated 2 years ago
- PyPSATopo is a tool that allows generating the topographical representation of any arbitrary PyPSA-based network · ☆24 · Dec 19, 2025 · Updated 3 months ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 · ☆286 · Aug 9, 2022 · Updated 3 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends · ☆2,013 · Updated this week
- Fuzzy string matching, grouping, and evaluation. · ☆792 · Jul 10, 2025 · Updated 8 months ago
- A collection of python utility functions · ☆11 · Mar 12, 2026 · Updated last week
- Julia enum made nicer · ☆10 · May 28, 2020 · Updated 5 years ago
- A content-filtering bypass system developed specifically to allow access to trans-related resources on public networks (libraries, school… · ☆27 · Nov 15, 2014 · Updated 11 years ago
- Scripts to download the U.S. Department of Justice's National Caseload Data and load it into Amazon Athena for querying · ☆15 · May 22, 2023 · Updated 2 years ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… · ☆161 · Nov 18, 2022 · Updated 3 years ago
- ☆12 · Jun 3, 2023 · Updated 2 years ago
- Super Fast String Matching in Python · ☆370 · Mar 14, 2025 · Updated last year
- Showcasing various NLP Downstream tasks Training with pre-trained Language models using Pytorch Lightning · ☆13 · Aug 7, 2022 · Updated 3 years ago
- Pandas in black and white: a collection of opinionated pandas flashcards · ☆14 · Feb 15, 2019 · Updated 7 years ago
- Edit CSV files using a table editor · ☆23 · Jul 9, 2021 · Updated 4 years ago
- PyPSA MCP: PyPSA Energy Modeling for LLMs · ☆49 · Updated this week
- Streamlit component for the Yellowbrick visualization and model diagnostics library · ☆12 · Jul 8, 2021 · Updated 4 years ago
- HHS HCC SQL Risk Score Model · ☆13 · Feb 13, 2026 · Updated last month
- Analysis and breakdown of text messages (WhatsApp) history, to understand the small things in a relationship · ☆14 · Apr 8, 2019 · Updated 6 years ago
- Table Enforcer is my attempt to apply a sort of "test driven development" workflow to data cleaning and validation. A python package to f… · ☆19 · Feb 26, 2018 · Updated 8 years ago
- Supplementary code for "Name2Vec: Personal Names Embeddings" presented at The Canadian Conference on AI 2019. · ☆18 · Jun 25, 2020 · Updated 5 years ago
- 🔎 Finds fuzzy matches between CSV files · ☆191 · Mar 26, 2025 · Updated 11 months ago
- Creation of LDA (Latent Dirichlet Allocation) Topic Model on corpus of books harvested from Project Gutenberg · ☆27 · Apr 5, 2018 · Updated 7 years ago
- A simple command line interface to the datamade/dedupe library. · ☆43 · Dec 26, 2022 · Updated 3 years ago
- micro-library to produce a couple of basic, attractive, printable plots with matplotlib · ☆11 · Mar 4, 2018 · Updated 8 years ago
- A minimum reproducible repository for embedding panel in FastAPI · ☆18 · Nov 3, 2021 · Updated 4 years ago
- Open modelling of European power systems in Python: a proof-of-concept · ☆40 · Jan 13, 2023 · Updated 3 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi · ☆81 · May 28, 2021 · Updated 4 years ago
- 🍯 Sweet simple static site generator with Dune vibes · ☆13 · Feb 17, 2026 · Updated last month
- Library for unit extraction - fork of quantulum for python3 · ☆144 · Mar 9, 2026 · Updated last week
- This library was created in order to evaluate the effectiveness of any kind of algorithm used in IR systems and analyze how well they per… · ☆15 · Apr 15, 2020 · Updated 5 years ago
- A pytorch implementation of "SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data" · ☆29 · Jul 23, 2019 · Updated 6 years ago