moj-analytical-services / splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
☆1,939 Updated this week
Alternatives and similar repositories for splink
Users who are interested in splink are comparing it to the libraries listed below.
- dbt adapter for DuckDB ☆1,226 Updated this week
- A powerful and modular toolkit for record linkage and duplicate detection in Python ☆1,045 Updated last year
- Scalable identity resolution, entity resolution, data mastering and deduplication using ML ☆1,146 Updated this week
- Monte Carlo simulation of the NBA season, leveraging dbt, duckdb and evidence.dev ☆588 Updated 2 weeks ago
- Repository for the ActivitySchema spec and supporting materials ☆437 Updated 3 years ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,136 Updated last week
- Scalable and efficient data transformation framework - backwards compatible with dbt. ☆2,891 Updated this week
- Data Contracts engine for the modern data stack. https://www.soda.io ☆2,281 Updated last week
- Malloy is a modern open source language for describing data relationships and transformations. ☆2,395 Updated this week
- Dagster Labs' open-source data platform, built with Dagster. ☆437 Updated this week
- do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning m… ☆856 Updated last year
- Port(ish) of Great Expectations to dbt test macros ☆1,204 Updated last year
- Lightweight and extensible compatibility layer between dataframe libraries! ☆1,519 Updated this week
- Turning PySpark Into a Universal DataFrame API ☆485 Updated last week
- Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code. ☆2,482 Updated this week
- Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database. ☆798 Updated this week
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,569 Updated last year
- Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to wr… ☆2,341 Updated this week
- A light-weight, flexible, and expressive statistical data testing library ☆4,190 Updated this week
- dbt + Metabase integration ☆572 Updated last week
- MetricFlow allows you to define, build, and maintain metrics in code. ☆1,468 Updated last week
- Python API for Deequ ☆810 Updated 3 weeks ago
- The smallest DuckDB SQL orchestrator on Earth. ☆336 Updated 2 months ago
- A list of free data matching and record linkage software. ☆401 Updated last year
- ☆385 Updated 2 years ago
- 🦆 A curated list of awesome DuckDB resources ☆2,266 Updated last week
- Fastest library to load data from DB to DataFrames in Rust and Python ☆2,556 Updated last week
- ☆396 Updated last week
- Better SQL in Jupyter. 📊 ☆840 Updated last month
- An extensible framework for linking databases and interactive views. ☆1,240 Updated last week