awslabs / python-deequ
Python API for Deequ
★ 809 · Updated 2 weeks ago
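For context before the comparison list, here is a minimal sketch of a typical PyDeequ verification run, modeled on the pattern the project documents. The sample DataFrame, check name, and constraint thresholds are illustrative assumptions, and setup details (such as the SPARK_VERSION environment variable recent releases read) may vary by pydeequ and Spark version.

```python
# Minimal PyDeequ sketch: run a few data-quality checks on a Spark DataFrame.
# The sample rows and constraints below are illustrative assumptions.
# Recent pydeequ releases read the SPARK_VERSION environment variable to pick
# the matching Deequ jar, so it may need to be set before this runs.
from pyspark.sql import SparkSession, Row

import pydeequ
from pydeequ.checks import Check, CheckLevel
from pydeequ.verification import VerificationSuite, VerificationResult

spark = (
    SparkSession.builder
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate()
)

df = spark.createDataFrame([
    Row(id=1, name="alice"),
    Row(id=2, name="bob"),
    Row(id=3, name=None),
])

check = Check(spark, CheckLevel.Warning, "Review Check")

result = (
    VerificationSuite(spark)
    .onData(df)
    .addCheck(
        check.hasSize(lambda size: size >= 3)   # at least 3 rows
             .isComplete("id")                  # no nulls in id
             .isUnique("id")                    # id values are distinct
    )
    .run()
)

# Turn the verification result into a DataFrame of per-constraint outcomes.
VerificationResult.checkResultsAsDataFrame(spark, result).show(truncate=False)
```

The underlying JVM Deequ library, listed among the alternatives below, exposes the same Check/VerificationSuite model in Scala.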
Alternatives and similar repositories for python-deequ
Users interested in python-deequ are comparing it to the libraries listed below.
- PySpark test helper methods with beautiful error messages · ★ 750 · Updated 3 weeks ago
- pyspark methods to enhance developer productivity · ★ 682 · Updated 11 months ago
- Delta Lake helper methods in PySpark · ★ 327 · Updated 2 weeks ago
- Apache Airflow integration for dbt · ★ 411 · Updated last year
- This repository has moved into https://github.com/dbt-labs/dbt-adapters · ★ 443 · Updated 6 months ago
- Template for a data contract used in a data mesh. · ★ 486 · Updated last year
- Great Expectations Airflow operator · ★ 170 · Updated last week
- Data Contracts engine for the modern data stack. https://www.soda.io · ★ 2,281 · Updated this week
- Port(ish) of Great Expectations to dbt test macros · ★ 1,204 · Updated last year
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. · ★ 375 · Updated 8 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow · ★ 224 · Updated 2 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… · ★ 279 · Updated 4 months ago
- Data pipeline with dbt, Airflow, Great Expectations · ★ 166 · Updated 4 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. · ★ 168 · Updated 2 years ago
- This repository has moved into https://github.com/dbt-labs/dbt-adapters · ★ 250 · Updated last year
- A Python Library to support running data quality rules while the spark job is running · ★ 197 · Updated this week
- CLI that makes it easy to create, test and deploy Airflow DAGs to Astronomer · ★ 437 · Updated this week
- The athena adapter plugin for dbt (https://getdbt.com) · ★ 139 · Updated 2 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. · ★ 3,578 · Updated this week
- Construct Apache Airflow DAGs Declaratively via YAML configuration files · ★ 1,413 · Updated last week
- ★ 201 · Updated 2 years ago
- Snowflake Snowpark Python API · ★ 325 · Updated this week
- Delta Lake examples · ★ 238 · Updated last year
- The easiest way to run Airflow locally, with linting & tests for valid DAGs and Plugins. · ★ 258 · Updated 4 years ago
- A repository of sample code to accompany our blog post on Airflow and dbt. · ★ 183 · Updated 2 years ago
- Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more! · ★ 628 · Updated last week
- A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow. · ★ 211 · Updated last month
- An open protocol for secure data sharing · ★ 917 · Updated 2 weeks ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes · ★ 63 · Updated 3 years ago
- This dbt package contains macros to support unit testing that can be (re)used across dbt projects. · ★ 448 · Updated 11 months ago