data61 / blocklib
Python implementations of record linkage blocking techniques.
☆21 Updated 2 years ago
Alternatives and similar repositories for blocklib
Users who are interested in blocklib are comparing it to the libraries listed below.
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆62 Updated last week
- Record matching and entity resolution at scale in Spark ☆35 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆38 Updated 2 months ago
- PyPI module for Graphlet AI Knowledge Graph Factory ☆31 Updated 2 years ago
- ☆48 Updated last year
- Python wrapper for a C++ Double Metaphone ☆15 Updated 3 weeks ago
- Python implementation of anonymous linkage using cryptographic linkage keys ☆68 Updated last year
- A browser user interface for manual labeling of record pairs. ☆47 Updated 2 years ago
- CLK hash: hash PII for entity matching ☆47 Updated 5 months ago
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated this week
- Scalable String Similarity Joins in Python ☆39 Updated last year
- Data wrangling simplicity, complete audit transparency, and at speed ☆35 Updated last month
- Record Linkage ToolKit (Find and link entities) ☆109 Updated 2 years ago
- Set-oriented Operations in Pandas ☆24 Updated 5 years ago
- Comparing Polars to Pandas and a small introduction ☆44 Updated 4 years ago
- Resources for tackling record linkage / deduplication / data matching problems ☆125 Updated last year
- A small Python module containing quick utility functions for standard ETL processes. ☆36 Updated this week
- Now included in rigour ☆152 Updated last month
- A selection of business datasets ☆18 Updated 6 years ago
- Extract information from XBRL files in the ESEF format ☆12 Updated last week
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆103 Updated this week
- Language detection using spaCy and fastText ☆57 Updated last year
- Reddit Gender Text-Classification. ☆11 Updated 2 years ago
- GraphiPy: Universal Social Data Extractor ☆82 Updated 2 years ago
- Docker template for basic data science packages to interface with Neo4j ☆14 Updated 3 years ago
- Trying to generate name synonyms from Wikidata ☆34 Updated 5 years ago
- Entity Matching Model solves the problem of matching company names between two possibly very large datasets. ☆83 Updated 8 months ago
- High-performance data retrieval from Neo4j with Apache Arrow 🏹 ☆31 Updated 3 years ago
- Loading OpenSanctions into Neo4j and Linkurious ☆30 Updated 10 months ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 Updated last year