emmc15 / pyspark-testing-env
Example repo for full end-to-end PySpark testing via docker-compose
☆31 Updated 2 years ago
Alternatives and similar repositories for pyspark-testing-env
Users who are interested in pyspark-testing-env are comparing it to the libraries listed below
- Delta Lake helper methods in PySpark ☆324 Updated last year
- PySpark test helper methods with beautiful error messages ☆730 Updated 2 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆275 Updated last month
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆222 Updated last week
- A Python Library to support running data quality rules while the spark job is running⚡ ☆193 Updated this week
- Quickstart for any service ☆167 Updated this week
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆375 Updated 6 months ago
- Template for a data contract used in a data mesh. ☆484 Updated last year
- Turning PySpark Into a Universal DataFrame API ☆455 Updated last week
- The easiest way to run Airflow locally, with linting & tests for valid DAGs and Plugins. ☆257 Updated 4 years ago
- Slow & local data allows you to move fast and deliver business value for the 99.9% of the data challenges. ☆325 Updated 2 months ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆77 Updated last week
- Code for "Efficient Data Processing in Spark" Course ☆347 Updated last month
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle… ☆121 Updated 8 months ago
- Python API for Deequ ☆806 Updated 7 months ago
- This repository has moved into https://github.com/dbt-labs/dbt-adapters ☆251 Updated 9 months ago
- Quick Guides from Dremio on Several topics ☆79 Updated last week
- Dagster Labs' open-source data platform, built with Dagster. ☆420 Updated this week
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆223 Updated 7 months ago
- Delta Lake examples ☆233 Updated last year
- Schema modelling framework for decentralised domain-driven ownership of data. ☆259 Updated last year
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB and Superset ☆256 Updated last month
- Data pipeline with dbt, Airflow, Great Expectations ☆165 Updated 4 years ago
- Great Expectations Airflow operator ☆169 Updated last week
- Apache Airflow integration for dbt ☆411 Updated last year
- A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects ☆192 Updated 8 months ago
- New Generation Opensource Data Stack Demo ☆452 Updated 2 years ago
- A Database Change Management tool for Snowflake ☆602 Updated this week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆227 Updated last month
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆168 Updated 2 years ago