"1 + 1 = 1 or Record Deduplication with Python" Jupyter Notebook
☆84 Dec 8, 2022 Updated 3 years ago
Alternatives and similar repositories for deduplication-slides
Users who are interested in deduplication-slides are comparing it to the repositories listed below
- A browser user interface for manual labeling of record pairs. ☆48 Jun 23, 2023 Updated 2 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,046 Feb 21, 2024 Updated 2 years ago
- Ipython notebooks on various topics ☆67 Oct 4, 2018 Updated 7 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Jan 12, 2026 Updated last month
- Wave Partial Differential Equation Solver in Python ☆14 Jun 5, 2024 Updated last year
- Browser automation for creating new pages in WordPress ☆13 Jun 7, 2025 Updated 8 months ago
- A list of free data matching and record linkage software. ☆401 Feb 21, 2024 Updated 2 years ago
- This repo contains all my RL work and learnings ☆12 Dec 5, 2021 Updated 4 years ago
- Demonstrating the efficiency of pmdarima’s auto_arima() function compared to implementing a traditional ARIMA model. ☆12 Feb 16, 2021 Updated 5 years ago
- Demonstration of how dedupe might be used as a geocoder ☆17 Jun 21, 2022 Updated 3 years ago
- Scientific Computing in Python, a practical and ultimate tutorial ☆14 Mar 7, 2023 Updated 2 years ago
- A collection of Python scripts ☆12 Feb 7, 2020 Updated 6 years ago
- AlgoTree ☆16 Jan 30, 2026 Updated last month
- Building a maintainable Machine Learning pipeline using DVC ☆16 Jul 7, 2020 Updated 5 years ago
- Examples for using the dedupe library ☆419 Aug 10, 2024 Updated last year
- A python package to create a database on the platform using our moj data warehousing framework ☆21 Feb 11, 2026 Updated 3 weeks ago
- Record linking package that fuzzy matches two Python pandas dataframes using sqlite3 fts4 ☆286 Aug 9, 2022 Updated 3 years ago
- Scalable String Similarity Joins in Python ☆39 Jul 12, 2024 Updated last year
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,440 Jul 29, 2025 Updated 7 months ago
- The engine behind Vinta's Lessons Learned page. ☆38 Dec 26, 2022 Updated 3 years ago
- ☆23 Jan 25, 2023 Updated 3 years ago
- Motivational website to do something special this month ☆21 Jan 11, 2024 Updated 2 years ago
- Start your journey into social media analysis of politicians by using Python (Tutorial) ☆21 Mar 26, 2019 Updated 6 years ago
- Repository for GH public projects ☆18 Feb 29, 2024 Updated 2 years ago
- Python implementations of record linkage blocking techniques. ☆21 Oct 2, 2023 Updated 2 years ago
- An online jukebox with all the songs from Deezer and YouTube. Built with Django and Angular. ☆22 Apr 11, 2016 Updated 9 years ago
- ☆23 May 18, 2021 Updated 4 years ago
- ☆21 Jan 21, 2023 Updated 3 years ago
- The Google Refine Python Client Library provides an interface for communicating with a Google Refine server. ☆27 Feb 2, 2017 Updated 9 years ago
- ☆21 Jul 6, 2023 Updated 2 years ago
- Link Wikidata items to large catalogs ☆96 Updated this week
- Simple samples for writing ETL transform scripts in Python ☆24 Jan 20, 2026 Updated last month
- Find out which countries have won the most medals and how the participation of nations has changed over time, with R ☆10 Aug 22, 2021 Updated 4 years ago
- Basic machine learning algorithm implementation ☆18 Mar 7, 2024 Updated last year
- A broker agnostic implementation of outbox and other message resilience patterns for Django apps. ☆35 Feb 9, 2026 Updated 3 weeks ago
- Weibull Analysis Tools ☆27 May 24, 2025 Updated 9 months ago
- ☆26 Nov 9, 2019 Updated 6 years ago
- A python module for numerical optimization. ☆29 May 2, 2018 Updated 7 years ago
- LSHDB is a parallel and distributed data engine, which relies on Locality-Sensitive Hashing and noSQL systems, for performing record link… ☆31 Aug 30, 2022 Updated 3 years ago