data-engineering-helpers / data-contracts
Food for thought around data contracts
☆24, updated this week
Alternatives and similar repositories for data-contracts:
Users who are interested in data-contracts are comparing it to the repositories listed below:
- A simple and easy-to-use Data Quality (DQ) tool built with Python. ☆49, updated last year
- Ingesting data with Pulumi, AWS Lambdas and Snowflake in a scalable, fully replayable manner ☆71, updated 3 years ago
- A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆183, updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆205, updated this week
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆171, updated 6 months ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆48, updated 3 months ago
- ☆34, updated 2 years ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆32, updated 2 years ago
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆182, updated this week
- 🧱 A collection of supplementary utilities and helper notebooks to perform admin tasks on Databricks ☆54, updated 2 months ago
- Code snippets for the Data Engineering Design Patterns book ☆73, updated last month
- Data product portal created by Dataminded ☆176, updated this week
- Demo of a Streamlit application with Databricks SQL Endpoint ☆36, updated 2 years ago
- A write-audit-publish implementation on a data lake without the JVM ☆46, updated 6 months ago
- ☆74, updated 4 months ago
- Example repository showing how to build a data platform with Prefect, dbt and Snowflake ☆98, updated 2 years ago
- Package to assert rows in-line with dbt macros. ☆66, updated 3 months ago
- Generate dbt tests based on sample data ☆36, updated last year
- Cost Efficient Data Pipelines with DuckDB ☆49, updated 7 months ago
- A Python library to support running data quality rules while the Spark job is running ⚡ ☆174, updated last week
- Fake Pandas / PySpark DataFrame creator ☆45, updated 11 months ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆38, updated 2 years ago
- ☆111, updated 7 months ago
- A dbt Core package for generating models from an activity stream. ☆39, updated 10 months ago
- Sample configuration to deploy a modern data platform. ☆88, updated 3 years ago
- ☆27, updated 2 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark DataFrames ☆63, updated 2 years ago
- [DEPRECATED] A dbt adapter for Excel. ☆92, updated last year
- Pytest plugin for dbt Core ☆58, updated last month
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆64, updated 5 months ago