data61 / anonlink
Python implementation of anonymous linkage using cryptographic linkage keys
☆63, updated 6 months ago
Related projects
Alternatives and complementary repositories for anonlink
- CLK hash: hash PII for entity matching (☆47, updated last year)
- Privacy Preserving Record Linkage Service (☆26, updated last year)
- Distributed Bayesian Entity Resolution in Apache Spark (☆57, updated 3 years ago)
- A maximum-strength name parser for record linkage. (☆34, updated 3 months ago)
- Python implementations of record linkage blocking techniques. (☆19, updated last year)
- The SQL/Ibis powered sklearn of record linkage (☆14, updated this week)
- Python wrapper for a C++ Double Metaphone (☆15, updated last year)
- A simple command line interface to the datamade/dedupe library. (☆42, updated last year)
- Resources for tackling record linkage / deduplication / data matching problems (☆112, updated 9 months ago)
- List of entity resolution software and resources. (☆38, updated 8 months ago)
- Framework for processing data packages in pipelines of modular components. (☆119, updated last year)
- Command line tool to convert spreadsheets to databases, made for the UK's Office for National Statistics. (☆78, updated 11 months ago)
- An experimental Athena extension for DuckDB 🐤 (☆50, updated 9 months ago)
- Linear regression in SQL using dbt (☆66, updated last month)
- pyspark-parallelised functions producing graph-theoretical metrics in connected component clusters for use in record-linkage (or other do… (☆10, updated last year)
- @vega transforms with @ibis-project expressions (☆30, updated 3 years ago)
- A Python package to create a database on the platform using our moj data warehousing framework (☆21, updated 2 months ago)
- Data wrangling simplicity, complete audit transparency, and at speed (☆35, updated 2 months ago)
- (☆13, updated 5 years ago)
- A browser user interface for manual labeling of record pairs. (☆41, updated last year)
- Embedded MonetDB with a Python frontend and fast NumPy/Pandas support (☆61, updated last month)
- (☆10, updated 4 years ago)
- An open source data analysis platform with features for users with a range of technical skills (☆45, updated this week)
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). (☆14, updated 5 years ago)
- Convert a CSV to a parquet file. (☆64, updated last year)
- pseudopeople is a Python package that generates realistic simulated data about a fictional United States population, designed for use in … (☆20, updated this week)
- Model drift detection (☆11, updated last year)
- Interactive notebooks containing demonstration code of the splink library (☆37, updated 10 months ago)
- Fork of the Freely Extensible Biomedical Record Linkage program (☆24, updated 8 years ago)
- Copy Pandas DataFrames and HDF5 files to PostgreSQL database (☆54, updated 3 months ago)