data61 / blocklib
Python implementations of record linkage blocking techniques.
☆21 Updated 2 years ago
Alternatives and similar repositories for blocklib
Users interested in blocklib are comparing it to the libraries listed below.
- Python wrapper for a C++ Double Metaphone ☆15 Updated last month
- CLK hash: hash pii for entity matching ☆47 Updated 7 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆64 Updated last week
- A maximum-strength name parser for record linkage. ☆39 Updated 4 months ago
- Python implementation of anonymous linkage using cryptographic linkage keys ☆70 Updated last year
- Scalable String Similarity Joins in Python ☆39 Updated last year
- PyPi module for Graphlet AI Knowledge Graph Factory ☆33 Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆48 Updated 2 years ago
- API client for Aleph, supports bulk entity and document upload. ☆29 Updated last year
- ☆48 Updated last year
- Record matching and entity resolution at scale in Spark ☆36 Updated 2 years ago
- Now included in rigour ☆152 Updated last month
- Record Linkage ToolKit (Find and link entities) ☆111 Updated 2 years ago
- Trying to generate name synonyms from wikidata ☆34 Updated 5 years ago
- High-performance data retrieval from Neo4j with Apache Arrow 🏹 ☆32 Updated 3 years ago
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated last month
- A tool to read CSV files with CSVW metadata and transform them into other formats. ☆33 Updated 6 years ago
- data wrangling simplicity, complete audit transparency, and at speed ☆35 Updated 3 months ago
- Fuzzy Categorical Distances ☆14 Updated 5 years ago
- Metadata and data identification tool and Python library. Identifies PII, common identifiers, language specific identifiers. Fully custom… ☆45 Updated this week
- Loading OpenSanctions into Neo4J and Linkurious ☆31 Updated last year
- A simple command line interface to the datamade/dedupe library. ☆43 Updated 3 years ago
- A selection of business datasets ☆18 Updated 6 years ago
- Reference Graph Gists ☆45 Updated 4 years ago
- This is the repo for the Giotto-tda use-cases challenge 2020. ☆23 Updated 4 years ago
- Interactive notebooks containing demonstration code of the splink library ☆40 Updated last year
- A small Python module containing quick utility functions for standard ETL processes. ☆37 Updated 2 weeks ago
- Copy Pandas DataFrames and HDF5 files to PostgreSQL database ☆55 Updated last month
- Hierarchical Clustering Algorithms ☆36 Updated 3 years ago
- Privacy Preserving Record Linkage Service ☆27 Updated 2 years ago