sodadata/soda-core

sodadata / soda-core

Data Contracts engine for the modern data stack. https://www.soda.io

☆2,397

Alternatives and similar repositories for soda-core

Users who are interested in soda-core are comparing it to the libraries listed below.

fivetran / great_expectations
Always know what to expect from your data.
☆11,664 · Updated this week
re-data / re-data
re_data - fix data issues before your users & CEO would discover them 😊
☆1,566 · Apr 30, 2024 · Updated 2 years ago
sodadata / soda-spark
Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes
☆64 · Mar 23, 2026 · Updated 4 months ago
SQLMesh / sqlmesh
Scalable and efficient data transformation framework - backwards compatible with dbt.
☆3,214 · Updated this week
elementary-data / elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-host…
☆2,380 · Updated this week
datafold / data-diff
Compare tables within or across databases
☆2,990 · May 17, 2024 · Updated 2 years ago
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,636 · Updated this week
dbt-labs / dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application…
☆13,507 · Updated this week
OpenLineage / OpenLineage
An Open Standard for lineage metadata collection
☆2,558 · Updated this week
calogica / dbt-expectations
Port(ish) of Great Expectations to dbt test macros
☆1,229 · Dec 16, 2024 · Updated last year
amundsen-io / amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting…
☆4,781 · Jul 1, 2026 · Updated 3 weeks ago
dlt-hub / dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
☆5,646 · Updated this week
MarquezProject / marquez
Collect, aggregate, and visualize a data ecosystem's metadata
☆2,245 · Updated this week
dagster-io / dagster
An orchestration platform for the development, production, and observation of data assets.
☆15,883 · Updated this week
dbt-labs / metricflow
MetricFlow allows you to define, build, and maintain metrics in code.
☆1,703 · Updated this week
airbytehq / airbyte
Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both …
☆21,682 · Updated this week
sqlfluff / sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
☆9,820 · Updated this week
dbt-checkpoint / dbt-checkpoint
List of `pre-commit` hooks to ensure the quality of your `dbt` projects.
☆754 · Jun 18, 2026 · Updated last month
awslabs / python-deequ
Python API for Deequ
☆823 · Updated this week
datacontract / datacontract-cli
Enforce Data Contracts
☆958 · Updated this week
tobymao / sqlglot
Python SQL Parser and Transpiler
☆9,454 · Updated this week
meltano / meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to wr…
☆2,568 · Updated this week
datahub-project / datahub
The Context Platform for your Data and AI Stack
☆12,327 · Updated this week
mage-ai / mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
☆8,778 · Jul 17, 2026 · Updated last week
fal-ai / dbt-fal
do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning m…
☆853 · Apr 5, 2024 · Updated 2 years ago
whylabs / whylogs
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model perf…
☆2,828 · Jan 10, 2025 · Updated last year
elementary-data / dbt-data-reliability
This dbt package captures metadata, artifacts, and test results so you can detect anomalies, monitor data quality, and build metadata tab…
☆516 · Jul 7, 2026 · Updated 2 weeks ago
fugue-project / fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew…
☆2,170 · May 19, 2026 · Updated 2 months ago
projectnessie / nessie
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
☆1,483 · Updated this week
unionai-oss / pandera
A light-weight, flexible, and expressive statistical data testing library
☆4,409 · Updated this week
dbt-labs / dbt-utils
Utility functions for dbt projects.
☆1,780 · Jul 7, 2026 · Updated 2 weeks ago
duckdb / dbt-duckdb
dbt adapter for DuckDB
☆1,322 · Updated this week
PrefectHQ / prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
☆23,466 · Updated this week
InfuseAI / piperider
Code review for data in dbt
☆495 · Jan 3, 2025 · Updated last year
ibis-project / ibis
the portable Python dataframe library
☆6,605 · Updated this week
astronomer / dag-factory
Construct Apache Airflow DAGs Declaratively via YAML configuration files
☆1,445 · Updated this week
lightdash / lightdash
Agentic BI. Analytics at the speed of code ⚡️
☆5,981 · Updated this week
open-metadata / OpenMetadata
The Open Context Layer for Data and AI, OpenMetadata is the open platform for building trusted data context and business semantics for …
☆14,544 · Updated this week
astronomer / astronomer-cosmos
Run your dbt Core or dbt Fusion projects as Apache Airflow DAGs and Task Groups with a few lines of code
☆1,233 · Updated this week