great-expectations/great_expectations

Always know what to expect from your data.

☆11,556

Alternatives and similar repositories for great_expectations

Users who are interested in great_expectations are comparing it to the libraries listed below.

dbt-labs / dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application…
☆12,990 • Updated this week
sodadata / soda-core
Data Contracts engine for the modern data stack. https://www.soda.io
☆2,372 • Updated this week
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,622 • Updated this week
dagster-io / dagster
An orchestration platform for the development, production, and observation of data assets.
☆15,699 • Updated this week
amundsen-io / amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting…
☆4,771 • Jun 1, 2026 • Updated 2 weeks ago
kedro-org / kedro
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and…
☆10,887 • Updated this week
PrefectHQ / prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
☆22,598 • Updated this week
unionai-oss / pandera
A light-weight, flexible, and expressive statistical data testing library
☆4,376 • Jun 12, 2026 • Updated last week
treeverse / dvc
🦉 Data Versioning and ML Experiments
☆15,675 • Jun 8, 2026 • Updated last week
Data-Centric-AI-Community / fg-data-profiling
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
☆13,602 • Apr 22, 2026 • Updated last month
Netflix / metaflow
Build, Manage and Deploy AI/ML Systems
☆10,133 • Updated this week
apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆45,788 • Jun 12, 2026 • Updated last week
feast-dev / feast
The Open Source Feature Store for AI/ML
☆7,095 • Updated this week
datahub-project / datahub
The Context Platform for your Data and AI Stack
☆12,075 • Jun 12, 2026 • Updated last week
mlflow / mlflow
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, a…
☆26,506 • Updated this week
modin-project / modin
Modin: Scale your Pandas workflows by changing a single line of code
☆10,388 • Feb 10, 2026 • Updated 4 months ago
sqlfluff / sqlfluff
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
☆9,743 • Jun 11, 2026 • Updated last week
airbytehq / airbyte
Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both …
☆21,449 • Updated this week
re-data / re-data
re_data - fix data issues before your users & CEO would discover them 😊
☆1,567 • Apr 30, 2024 • Updated 2 years ago
calogica / dbt-expectations
Port(ish) of Great Expectations to dbt test macros
☆1,227 • Dec 16, 2024 • Updated last year
nteract / papermill
📚 Parameterize, execute, and analyze notebooks
☆6,450 • May 12, 2026 • Updated last month
ibis-project / ibis
the portable Python dataframe library
☆6,573 • Updated this week
OpenLineage / OpenLineage
An Open Standard for lineage metadata collection
☆2,506 • Updated this week
MarquezProject / marquez
Collect, aggregate, and visualize a data ecosystem's metadata
☆2,214 • Jun 10, 2026 • Updated last week
pola-rs / polars
Extremely fast Query Engine for DataFrames, written in Rust
☆38,754 • Updated this week
evidentlyai / evidently
Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. Fro…
☆7,607 • May 2, 2026 • Updated last month
streamlit / streamlit
Streamlit — A faster way to build and share data apps.
☆44,994 • Updated this week
vaexio / vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s…
☆8,506 • Apr 1, 2026 • Updated 2 months ago
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,851 • Updated this week
ploomber / ploomber
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
☆3,624 • May 29, 2025 • Updated last year
flyteorg / flyte
Dynamic, resilient AI orchestration. Coordinate data, models, and compute as you build AI workflows.
☆7,088 • Updated this week
SQLMesh / sqlmesh
Scalable and efficient data transformation framework - backwards compatible with dbt.
☆3,140 • Updated this week
apache / superset
Apache Superset is a Data Visualization and Data Exploration Platform
☆73,298 • Updated this week
dask / dask
Parallel computing with task scheduling
☆13,846 • Jun 11, 2026 • Updated last week
iterative / cml
♾️ CML - Continuous Machine Learning | CI/CD for ML
☆4,178 • Jun 2, 2025 • Updated last year
tobymao / sqlglot
Python SQL Parser and Transpiler
☆9,336 • Updated this week
fugue-project / fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew…
☆2,165 • May 19, 2026 • Updated 3 weeks ago
datafold / data-diff
Compare tables within or across databases
☆2,988 • May 17, 2024 • Updated 2 years ago
whylabs / whylogs
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model perf…
☆2,822 • Jan 10, 2025 • Updated last year