data-engineering-helpers / data-contracts
Food for thought around data contracts
☆24 · Updated this week
Related projects
Alternatives and complementary repositories for data-contracts
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆166 · Updated 3 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆189 · Updated this week
- Demo of Streamlit application with Databricks SQL Endpoint ☆33 · Updated 2 years ago
- Example repo to kickstart integration with MLflow pipelines. ☆73 · Updated 2 years ago
- Data product portal created by Dataminded ☆148 · Updated this week
- A Python library to support running data quality rules while the Spark job is running ⚡ ☆163 · Updated last week
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆62 · Updated last month
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆53 · Updated 2 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDB, or PyArrow. ☆22 · Updated 7 months ago
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆174 · Updated this week
- ☆48 · Updated 4 months ago
- ✨ A Pydantic to PySpark schema library ☆56 · Updated this week
- Soda Spark is a PySpark library that helps you with testing your data in Spark DataFrames ☆63 · Updated 2 years ago
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆195 · Updated this week
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆37 · Updated last week
- Demo DAGs that show how to run dbt Core in Airflow using Cosmos ☆46 · Updated last month
- Ingesting data with Pulumi, AWS Lambdas and Snowflake in a scalable, fully replayable manner ☆69 · Updated 2 years ago
- A simple and easy-to-use Data Quality (DQ) tool built with Python. ☆48 · Updated last year
- A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆180 · Updated last year
- A dbt-core Python package that automates the management and creation of dbt groups, contracts, access, and versions. ☆110 · Updated 4 months ago
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆41 · Updated 4 months ago
- In this repository we will store all materials for workshops, courses, etc. ☆35 · Updated this week
- Delta Lake helper methods in PySpark ☆304 · Updated 2 months ago
- ☆20 · Updated 3 years ago
- Home of the Open Data Contract Standard (ODCS). ☆392 · Updated last week
- Code snippets for Data Engineering Design Patterns book ☆40 · Updated last week
- Great Expectations Airflow operator ☆159 · Updated 3 weeks ago
- Fake Pandas / PySpark DataFrame creator ☆42 · Updated 8 months ago