A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆135 · Feb 15, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for linktransformer
Users who are interested in linktransformer compare it to the libraries listed below.
- The SQL/Ibis-powered sklearn of record linkage ☆24 · Feb 9, 2026 · Updated last month
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆89 · Nov 3, 2025 · Updated 4 months ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆36 · Dec 3, 2023 · Updated 2 years ago
- Repository for in-class material for Data Bootcamp ☆13 · May 18, 2019 · Updated 6 years ago
- Continuous Benchmark of Filtering methods for Entity Resolution ☆11 · Jul 20, 2025 · Updated 7 months ago
- 📰🗞 New York Times data ☆12 · Aug 4, 2018 · Updated 7 years ago
- This repository aims to build a comprehensive literature review of the economics of open source software. Contributions welcome. ☆12 · Apr 2, 2025 · Updated 11 months ago
- Specification Curve is a Python package that performs specification curve analysis: exploring how a coefficient varies under multiple dif… ☆29 · Feb 27, 2026 · Updated last week
- pseudopeople is a Python package that generates realistic simulated data about a fictional United States population, designed for use in … ☆24 · Feb 20, 2026 · Updated 2 weeks ago
- A fast TUI application (with optional webui) to visually navigate and inspect JSON and JSONL data. Easily localize parse errors in large … ☆15 · Sep 30, 2024 · Updated last year
- Ecological mixed-effects ordination with lme4 ☆12 · May 9, 2016 · Updated 9 years ago
- A subset of the SQL dialect for ClickHouse DB. ☆13 · Jan 9, 2023 · Updated 3 years ago
- Parse Searchable Electoral Rolls ☆11 · Apr 20, 2025 · Updated 10 months ago
- Blocking records for record linkage and data deduplication based on ANN algorithms in Python. ☆19 · Nov 28, 2025 · Updated 3 months ago
- An R package for blocking records for record linkage / data deduplication based on approximate nearest neighbours algorithms. ☆14 · Feb 8, 2026 · Updated last month
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 · Nov 18, 2022 · Updated 3 years ago
- ☆18 · Feb 12, 2026 · Updated 3 weeks ago
- Implements several Markov chain Monte Carlo (MCMC) algorithms for the latent Dirichlet allocation (LDA) model ☆11 · Feb 11, 2020 · Updated 6 years ago
- Repository for introductory training materials for overlapping generations modeling ☆13 · Oct 30, 2024 · Updated last year
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23) ☆18 · Jun 24, 2024 · Updated last year
- Text generation using language models with multiple exit heads ☆16 · Sep 18, 2025 · Updated 5 months ago
- Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space … ☆32 · Apr 5, 2023 · Updated 2 years ago
- A Tool for the Congress Data dataset ☆25 · Dec 8, 2025 · Updated 3 months ago
- List of entity resolution software and resources. ☆109 · Feb 22, 2025 · Updated last year
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆1,996 · Updated this week
- ☆23 · Jan 25, 2024 · Updated 2 years ago
- ☆19 · Jan 4, 2024 · Updated 2 years ago
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆17 · Dec 20, 2021 · Updated 4 years ago
- UI for JedAI Toolkit ☆17 · May 20, 2022 · Updated 3 years ago
- ☆22 · Jul 15, 2024 · Updated last year
- Income Accounting ☆17 · Feb 11, 2021 · Updated 5 years ago
- Econometrics in PyTorch ☆28 · Feb 18, 2026 · Updated 2 weeks ago
- Slides and homework for model-based inference ☆13 · Sep 26, 2017 · Updated 8 years ago
- Computational Text Analysis Workshop Materials ☆36 · May 6, 2016 · Updated 9 years ago
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,046 · Feb 21, 2024 · Updated 2 years ago
- [ICML 2024] Recurrent Distance Filtering for Graph Representation Learning ☆15 · Jun 10, 2024 · Updated last year
- Zero-Shot Learning in Named Entity Recognition with Common Sense Knowledge ☆17 · Nov 16, 2021 · Updated 4 years ago
- Similarity and distance measures for clustering and record linkage applications in R ☆18 · Sep 23, 2025 · Updated 5 months ago
- A maximum-strength name parser for record linkage. ☆39 · Sep 3, 2025 · Updated 6 months ago