data61 / blocklib
Python implementations of record linkage blocking techniques.
☆19 · Updated last year
Related projects
Alternatives and complementary repositories for blocklib
- CLK hash: hash PII for entity matching ☆47 · Updated last year
- Privacy Preserving Record Linkage Service ☆26 · Updated last year
- Python implementation of anonymous linkage using cryptographic linkage keys ☆63 · Updated 6 months ago
- Python wrapper for a C++ Double Metaphone ☆15 · Updated last year
- ☆13 · Updated 5 years ago
- Burglary prediction for mortals ☆10 · Updated 6 months ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆14 · Updated 5 years ago
- Graphistry admin docs: launch, configure, use, & debug ☆23 · Updated last week
- Pipeline Explorer - Explore and analyze millions of pipelines learned using MLBlocks and MLPrimitives. ☆17 · Updated last year
- A maximum-strength name parser for record linkage. ☆34 · Updated 3 months ago
- Collaboration app for sharing and reviewing Jupyter notebooks ☆16 · Updated last year
- A financial disclosure data extraction tool. ☆13 · Updated last year
- Quickly compare changes made to Jupyter notebooks in GitHub repositories with jupydiff! ☆13 · Updated last year
- Plugin for Intake to read from SQL servers ☆15 · Updated last year
- Python library for MIME type parsing, normalisation and grouping. ☆12 · Updated last week
- Privacy preserving synthetic data generation workflows ☆20 · Updated 2 years ago
- IWAAN - An interactive Jupyter Notebook collection that lets you run analyses of Wikipedia article editing dynamics out-of-the-box on Bi… ☆9 · Updated 6 months ago
- ☆15 · Updated 2 years ago
- Record matching and entity resolution at scale in Spark ☆31 · Updated last year
- Scrapers for US municipal governments. ☆10 · Updated last year
- Apache Spark-based framework for analysis of A/B experiments ☆11 · Updated 2 weeks ago
- Datasette plugin providing instructions for exporting data to Jupyter or Observable ☆12 · Updated last year
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b… ☆23 · Updated last year
- The Panama Papers dataset and guide from the International Consortium of Investigative Journalists (ICIJ) ☆11 · Updated 3 weeks ago
- Scalable String Similarity Joins in Python ☆39 · Updated 4 months ago
- Datasette plugin for authenticating access using API tokens ☆12 · Updated 2 months ago
- Code that accompanies the PyData New York (2022) talk: Addressing the sensitivity of Large language models ☆12 · Updated 2 years ago
- A set of tools to accelerate work in Jupyter notebooks. ☆11 · Updated 4 years ago
- Simple interface to read, organize, and manipulate structured data in files on local and cloud storage ☆32 · Updated last year
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆52 · Updated 3 weeks ago