data61 / clkhash
CLK hash: hash PII for entity matching
☆47 · Updated last week
Alternatives and similar repositories for clkhash:
Users who are interested in clkhash are comparing it to the libraries listed below.
- Python implementation of anonymous linkage using cryptographic linkage keys ☆65 · Updated 11 months ago
- Privacy Preserving Record Linkage Service ☆26 · Updated 2 years ago
- Python implementations of record linkage blocking techniques. ☆20 · Updated last year
- Python wrapper for a C++ Double Metaphone ☆15 · Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆37 · Updated this week
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated 2 years ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆14 · Updated 6 years ago
- Framework for processing data packages in pipelines of modular components. ☆121 · Updated 3 months ago
- PMML evaluator library for the PostgreSQL database (http://www.postgresql.org/) ☆11 · Updated 10 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆123 · Updated last year
- Scalable String Similarity Joins in Python ☆39 · Updated 9 months ago
- A python package to create a database on the platform using our moj data warehousing framework ☆21 · Updated 8 months ago
- Abstractions for feature engineering on large graphs of tabular data. ☆21 · Updated last week
- Demonstration of how dedupe might be used as geocoder ☆17 · Updated 2 years ago
- this repo contains the draft, images, and code for the Medium blog post on altair themes. ☆12 · Updated 6 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 · Updated 2 years ago
- A browser user interface for manual labeling of record pairs. ☆47 · Updated last year
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques. ☆10 · Updated 11 months ago
- @vega transforms with @ibis-project expressions ☆29 · Updated 4 years ago
- Dedupe/batch geocode addresses and venues around the world with libpostal ☆82 · Updated 3 years ago
- 💨🥫 A Data Factory system for running data processing pipelines built on AirFlow and tailored to CKAN. Includes evolution of DataPusher … ☆31 · Updated 2 years ago
- Dexter document monitor for MMA ☆16 · Updated last year
- data wrangling simplicity, complete audit transparency, and at speed ☆34 · Updated last month
- Data Scientist code test ☆19 · Updated 4 years ago
- Algorithms for "schema matching" ☆26 · Updated 8 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆57 · Updated 3 weeks ago
- (Archived) A Python library for record linkage and deduplication. ☆19 · Updated last year
- CLI for creating databases for Data Quality Dashboards. ☆19 · Updated 5 years ago
- Provide partial dates and retain the date precision through processing ☆13 · Updated 2 years ago
- This repository contains CROW, the Clerical Resolution Online Widget, an open-source project designed to help data linkers with their cle… ☆10 · Updated this week