dell-research-harvard/linktransformer

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

☆141

Alternatives and similar repositories for linktransformer

Users who are interested in linktransformer are comparing it to the libraries listed below.

NickCrews / mismo
The SQL/Ibis powered sklearn of record linkage
☆24 · Jun 12, 2026 · Updated last month
AI-team-UoA / pyJedAI
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
☆97 · Mar 22, 2026 · Updated 3 months ago
OlivierBinette / er-evaluation
An End-to-End Evaluation Framework for Entity Resolution Systems
☆38 · Dec 3, 2023 · Updated 2 years ago
joeornstein / fuzzylink
Probabilistic Record Linkage Using Pretrained Text Embeddings
☆21 · Jun 30, 2026 · Updated 3 weeks ago
mwaugh0328 / data_bootcamp_spring_2019
Repository for in-class material for Data Bootcamp
☆14 · May 18, 2019 · Updated 7 years ago
aaron-lohmann / econ-oss
This repository aims to build a comprehensive literature review of the economics of open source software. Contributions welcome.
☆11 · Apr 2, 2025 · Updated last year
scify / JedAI-Spark
☆15 · Aug 11, 2022 · Updated 3 years ago
cleanzr / dblink
Distributed Bayesian Entity Resolution in Apache Spark
☆60 · Jun 10, 2021 · Updated 5 years ago
mkearney / nyt
📰🗞 New York Times data
☆12 · Aug 4, 2018 · Updated 7 years ago
geshijoker / PoliPrompt
The repository for PoliPrompt
☆18 · Oct 20, 2024 · Updated last year
giacfalk / blackmaRble
blackmaRble: retrieve, wrangle and plot VIIRS Black Marble nighttime light data in R
☆18 · Dec 21, 2023 · Updated 2 years ago
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
resteorts / almost-all-of-er
☆11 · Apr 2, 2021 · Updated 5 years ago
ncn-foreigners / blocking
An R package for blocking records for record linkage / data deduplication based on approximate nearest neighbours algorithms.
☆14 · Jun 30, 2026 · Updated 3 weeks ago
gpapadis / ContinuousFilteringBenchmark
Continuous Benchmark of Filtering methods for Entity Resolution
☆11 · Jul 20, 2025 · Updated last year
sischei / crest_comp_econ
☆13 · Jan 10, 2023 · Updated 3 years ago
cleanzr / record-linkage-tutorial
A tutorial on entity resolution (record linkage or de-duplication)
☆66 · Jun 30, 2020 · Updated 6 years ago
py-econometrics / jaxonometrics
Econometrics on the GPU (and CPU) via JAX
☆16 · Jul 12, 2025 · Updated last year
ilostat / Rilostat
Tools for ILO Open Data via ILOSTAT bulk download facility or SDMX web service
☆38 · Apr 24, 2026 · Updated 2 months ago
wbsg-uni-mannheim / MatchGPT
This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Enti…
☆67 · Oct 18, 2024 · Updated last year
OpenRG / OGprimer
Repository for introductory training materials for overlapping generations modeling
☆13 · Oct 30, 2024 · Updated last year
vintasoftware / entity-embed
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut…
☆161 · Nov 18, 2022 · Updated 3 years ago
econabhishek / datagovindia
This is an R wrapper for the APIs on the Government of India's open data platform, data.gov.in.
☆18 · Sep 22, 2024 · Updated last year
HighDimensionalEconLab / VarianceComponentsHDFE.jl
☆21 · Jan 4, 2024 · Updated 2 years ago
richardli / SUMMER
SAE Unit/area Models and Methods for Estimation in R
☆26 · May 10, 2026 · Updated 2 months ago
anhaidgroup / sparkly
☆19 · Apr 27, 2026 · Updated 2 months ago
OlivierBinette / StringCompare
Efficient String Comparison Functions and Fuzzy String Matching
☆21 · Sep 21, 2025 · Updated 10 months ago
stevencarlislewalker / lme4ord
Ecological mixed-effects ordination with lme4
☆12 · May 9, 2016 · Updated 10 years ago
AI-team-UoA / JedAI-WebApp
JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b…
☆26 · Apr 14, 2023 · Updated 3 years ago
in-rolls / parse_searchable_rolls
Parse Searchable Electoral Rolls
☆13 · Apr 20, 2025 · Updated last year
hollina / stacked-did-weights
☆25 · Jan 25, 2024 · Updated 2 years ago
ajl2718 / whereabouts
Fast, accurate, open-source geocoding in Python
☆77 · Jul 5, 2026 · Updated 2 weeks ago
bryangraham / ipt
Tilting estimators for program evaluation for Python 3
☆10 · Oct 31, 2019 · Updated 6 years ago
OlivierBinette / Awesome-Entity-Resolution
List of entity resolution software and resources.
☆132 · Mar 24, 2026 · Updated 3 months ago
AakaashRao / starbility
Coefficient stability plots in R
☆53 · Jan 29, 2021 · Updated 5 years ago
moj-analytical-services / splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
☆2,270 · Updated this week
jeffreyesun / Bucephalus.jl
A Julia package for solving heterogeneous-agent economic models using reinforcement learning
☆20 · Jul 28, 2022 · Updated 3 years ago
rmadhok / enviro-dev-grad
Graduate Environment & Development Economics at the University of Minnesota
☆21 · Jan 21, 2025 · Updated last year
zzstoatzz / raggy
Scraping and querying documents for LLMs
☆24 · Oct 6, 2025 · Updated 9 months ago