A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
☆137 · Feb 15, 2026 · Updated 2 months ago
Alternatives and similar repositories for linktransformer
Users who are interested in linktransformer are comparing it to the libraries listed below.
- The SQL/Ibis powered sklearn of record linkage ☆23 · Mar 29, 2026 · Updated 3 weeks ago
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆92 · Mar 22, 2026 · Updated 3 weeks ago
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆36 · Dec 3, 2023 · Updated 2 years ago
- Probabilistic Record Linkage Using Pretrained Text Embeddings ☆18 · Updated this week
- Repository for in class material for Data Bootcamp ☆14 · May 18, 2019 · Updated 6 years ago
- This repository aims to build a comprehensive literature review of the economics of open source software. Contributions welcome. ☆12 · Apr 2, 2025 · Updated last year
- ☆15 · Aug 11, 2022 · Updated 3 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆59 · Jun 10, 2021 · Updated 4 years ago
- The repository for PoliPrompt ☆18 · Oct 20, 2024 · Updated last year
- blackmaRble: retrieve, wrangle and plot VIIRS Black Marble nighttimelight data in R ☆18 · Dec 21, 2023 · Updated 2 years ago
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 · Mar 25, 2025 · Updated last year
- pseudopeople is a Python package that generates realistic simulated data about a fictional United States population, designed for use in … ☆24 · Mar 25, 2026 · Updated 3 weeks ago
- An R package for blocking records for record linkage / data deduplication based on approximate nearest neighbours algorithms. ☆14 · Apr 9, 2026 · Updated last week
- A tutorial on entity resolution (record linkage or de-duplication) ☆65 · Jun 30, 2020 · Updated 5 years ago
- This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Enti… ☆66 · Oct 18, 2024 · Updated last year
- ☆18 · Mar 18, 2026 · Updated last month
- Blocking records for record linkage and data deduplication based on ANN algorithms in Python. ☆20 · Mar 9, 2026 · Updated last month
- ☆19 · Jul 22, 2023 · Updated 2 years ago
- Causal Inference in Observational Data with Unobserved Heterogeneity (Lecture Notes. Masters/PhD-level) ☆40 · Feb 10, 2026 · Updated 2 months ago
- Repository for introductory training materials for overlapping generations modeling ☆13 · Oct 30, 2024 · Updated last year
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,046 · Feb 21, 2024 · Updated 2 years ago
- ☆19 · Jan 4, 2024 · Updated 2 years ago
- Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends ☆2,083 · Updated this week
- Similarity and distance measures for clustering and record linkage applications in R ☆18 · Sep 23, 2025 · Updated 6 months ago
- From Pedro Sant'Anna's A ready-to-fork Claude Code template for academics using LaTeX/Beamer + R. Multi-agent review, quality gates, adve… ☆149 · Apr 11, 2026 · Updated last week
- Ecological mixed-effects ordination with lme4 ☆12 · May 9, 2016 · Updated 9 years ago
- Demo of a supervised machine learning approach for Entity Resolution in graph using Neo4j GDS Link Prediction Pipelines ☆22 · Apr 11, 2022 · Updated 4 years ago
- Graduate Environment & Development Economics at the University of Minnesota ☆21 · Jan 21, 2025 · Updated last year
- Parse Searchable Electoral Rolls ☆12 · Apr 20, 2025 · Updated last year
- Implements several Markov chain Monte Carlo (MCMC) algorithms for the latent Dirichlet allocation (LDA) model ☆11 · Feb 11, 2020 · Updated 6 years ago
- ☆23 · Jul 15, 2024 · Updated last year
- Named Entity Recognition with the Nametag Maximum Entropy Markov model ☆12 · Feb 9, 2026 · Updated 2 months ago
- Evaluation and benchmarking of PatentsView disambiguation algorithms ☆15 · Jan 18, 2024 · Updated 2 years ago
- A Julia package for solving heterogenous-agent economic models using reinforcement learning ☆19 · Jul 28, 2022 · Updated 3 years ago
- scraping and querying documents for LLMs ☆24 · Oct 6, 2025 · Updated 6 months ago
- SAE Unit/area Models and Methods for Estimation in R ☆26 · Updated this week
- A maximum-strength name parser for record linkage. ☆40 · Sep 3, 2025 · Updated 7 months ago
- ☆56 · Updated this week
- R package fastLink: Fast Probabilistic Record Linkage ☆291 · Feb 28, 2026 · Updated last month