Repository for performing Blocking using Deep Learning based on the paper "Deep Learning for Blocking in Entity Matching: A Design Space Exploration"
☆31 · Apr 5, 2023 · Updated 2 years ago
Alternatives and similar repositories for DeepBlocker
Users who are interested in DeepBlocker compare it with the libraries listed below.
- ☆18 · Jun 17, 2024 · Updated last year
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations" ☆17 · Dec 20, 2021 · Updated 4 years ago
- Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond… ☆24 · May 31, 2022 · Updated 3 years ago
- This repository contains the code and data download links to reproduce building the WDC Products Benchmark. ☆15 · Jul 13, 2023 · Updated 2 years ago
- ☆32 · Apr 15, 2023 · Updated 2 years ago
- The code of our AAAI'20 paper "GraphER: Token-Centric Entity Resolution with Graph Convolutional Neural Networks" ☆11 · Aug 10, 2020 · Updated 5 years ago
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b… ☆25 · Apr 14, 2023 · Updated 2 years ago
- Code for the paper "Deep Entity Matching with Pre-trained Language Models" ☆307 · Apr 17, 2024 · Updated last year
- An End-to-End Evaluation Framework for Entity Resolution Systems ☆36 · Dec 3, 2023 · Updated 2 years ago
- ☆192 · May 29, 2024 · Updated last year
- Python package for performing Entity and Text Matching using Deep Learning. ☆615 · Jun 18, 2024 · Updated last year
- Code for 'Geospatial Entity Resolution' paper (WWW 2022) ☆19 · Apr 27, 2023 · Updated 2 years ago
- Code for the paper "CollaborEM: A Self-supervised Entity Matching Framework Using Multi-features Collaboration". TKDE 2021. ☆41 · Jul 12, 2022 · Updated 3 years ago
- Foundation Models for Data Tasks ☆110 · May 15, 2023 · Updated 2 years ago
- To reproduce experiments of the paper "Entity Matching with Transformer Architectures" ☆27 · Nov 4, 2019 · Updated 6 years ago
- Implementation of many similarity join algorithms. ☆15 · Mar 6, 2014 · Updated 12 years ago
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆90 · Nov 3, 2025 · Updated 4 months ago
- Continuous Benchmark of Filtering methods for Entity Resolution ☆11 · Jul 20, 2025 · Updated 8 months ago
- Master thesis - reproducing state-of-the-art schema matching algorithms ☆14 · Jul 6, 2023 · Updated 2 years ago
- [Machine Learning 2023] NaCL: Noise-Robust Cross-Domain Contrastive Learning for Unsupervised Domain Adaptation ☆12 · Jul 8, 2023 · Updated 2 years ago
- A Django 2.1 project to reproduce WebKit Bug 188165 and Django Ticket #30250 ☆15 · Mar 29, 2019 · Updated 6 years ago
- Utilities for working with Django's prefetch_related system ☆16 · Jan 12, 2022 · Updated 4 years ago
- This project focuses on DeepER, a deep learning framework for entity resolution (record deduplication). It examines how DeepER performs o… ☆47 · May 11, 2018 · Updated 7 years ago
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching … ☆105 · Oct 14, 2025 · Updated 5 months ago
- This repository contains code and extensive prompt examples to reproduce and extend the experiments in our papers "Using ChatGPT for Enti… ☆65 · Oct 18, 2024 · Updated last year
- A simple command line interface to the datamade/dedupe library. ☆43 · Dec 26, 2022 · Updated 3 years ago
- Airlift Challenge starter kit ☆10 · Apr 18, 2025 · Updated 11 months ago
- ☆11 · May 11, 2022 · Updated 3 years ago
- Repo - Paper "Capturing Semantics for Imputation with Pre-trained Language Models." [ICDE 2021] ☆10 · Mar 13, 2022 · Updated 4 years ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆150 · Oct 16, 2024 · Updated last year
- Resources for PVLDB 2023 submission ☆27 · Aug 28, 2024 · Updated last year
- Welcome to Snowman App – a Data Matching Benchmark Platform. ☆38 · Feb 9, 2023 · Updated 3 years ago
- A comprehensive and scalable set of string tokenizers and similarity measures in Python ☆143 · Feb 18, 2026 · Updated last month
- Code and data for the VLDB 2023 paper: RECA: Related Tables Enhanced Column Semantic Type Annotation Framework ☆12 · May 7, 2025 · Updated 10 months ago
- Trustworthy Knowledge Graph Completion Based on Multi-sourced Noisy Data, WWW 2022 ☆14 · Apr 6, 2022 · Updated 3 years ago
- Semi-supervised User Profiling with Heterogeneous Graph Attention Networks, IJCAI 19 ☆25 · Aug 18, 2020 · Updated 5 years ago
- Code Release for "On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies" ☆16 · Apr 13, 2021 · Updated 4 years ago
- GPTBundle, a React application toolkit, harnesses AI to convert textual content into structured forms and delivers advanced autofill sugg… ☆22 · Mar 27, 2024 · Updated last year
- An example of implementing adversarial discriminative domain adaptation on captcha dataset by using keras ☆11 · Feb 6, 2018 · Updated 8 years ago