data61 / blocklib
Python implementations of record linkage blocking techniques.
☆21 · Updated 2 years ago
Alternatives and similar repositories for blocklib
Users who are interested in blocklib are comparing it to the libraries listed below.
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆66 · Updated last week
- Python implementation of anonymous linkage using cryptographic linkage keys ☆70 · Updated last year
- CLK hash: hash pii for entity matching ☆48 · Updated 8 months ago
- A browser user interface for manual labeling of record pairs. ☆48 · Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆39 · Updated 5 months ago
- Record Linkage ToolKit (Find and link entities) ☆111 · Updated 2 years ago
- ☆48 · Updated last year
- Language detection using Spacy and Fasttext ☆57 · Updated 2 years ago
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 · Updated last month
- Python wrapper for a C++ Double Metaphone ☆15 · Updated 3 weeks ago
- Resources for tackling record linkage / deduplication / data matching problems ☆126 · Updated last year
- Record matching and entity resolution at scale in Spark ☆36 · Updated 2 years ago
- Scalable String Similarity Joins in Python ☆39 · Updated last year
- Set-oriented Operations in Pandas ☆24 · Updated 5 years ago
- Primrose modeling framework for simple production models ☆33 · Updated last year
- Interactive notebooks containing demonstration code of the splink library ☆40 · Updated 2 years ago
- PyPi module for Graphlet AI Knowledge Graph Factory ☆33 · Updated 2 years ago
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆106 · Updated last week
- Instant search for and access to many datasets in Pyspark. ☆34 · Updated 3 years ago
- PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolut… ☆161 · Updated 3 years ago
- 🚀 GUI for training spaCy models ☆55 · Updated 4 years ago
- A selection of business datasets ☆18 · Updated 6 years ago
- ☆70 · Updated 3 years ago
- An automation tool to refactor Jupyter Notebooks to Python modules, with code dependency analysis. ☆12 · Updated 11 months ago
- 💾 Script to import issues from a JIRA instance into a database. ☆57 · Updated 3 years ago
- Now included in rigour ☆152 · Updated 2 months ago
- Python package for deduplication/entity resolution using active learning ☆83 · Updated last year
- data wrangling simplicity, complete audit transparency, and at speed ☆35 · Updated 4 months ago
- Comparing Polars to Pandas and a small introduction ☆44 · Updated 4 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆37 · Updated last month