emmc15 / pyspark-testing-env
Example repo for full end-to-end PySpark testing via docker-compose
☆30 Updated last year
Alternatives and similar repositories for pyspark-testing-env:
Users who are interested in pyspark-testing-env are comparing it to the libraries listed below.
- A Python Library to support running data quality rules while the Spark job is running⚡ ☆167 Updated last week
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆192 Updated 3 weeks ago
- Delta Lake helper methods in PySpark ☆312 Updated 4 months ago
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆41 Updated 6 months ago
- Great Expectations Airflow operator ☆160 Updated 2 months ago
- ☆43 Updated 3 years ago
- A curated collection of publicly available resources on dbt best practices and how data-driven organizations around the world utilize dbt ☆112 Updated 2 years ago
- Quick Guides from Dremio on Several topics ☆67 Updated 2 months ago
- ☆72 Updated 3 months ago
- A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects ☆125 Updated last month
- dbt Cloud command line interface (CLI) ☆73 Updated 10 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDB, or PyArrow. ☆23 Updated 9 months ago
- Delta Lake Documentation ☆48 Updated 6 months ago
- A dbt-core python package that automates the management and creation of dbt groups, contracts, access, and versions. ☆113 Updated 6 months ago
- This repository contains source code for dbt package dbt_snow_mask. ☆61 Updated 5 months ago
- A dbt package for easily using production data in a development environment. ☆37 Updated 8 months ago
- Tools to handle dbt Jobs as well-defined YAML files ☆47 Updated last week
- A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆183 Updated last year
- An integration for dbt and fzf that allows interactive selection and search of dbt models. ☆67 Updated last year
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆229 Updated 2 months ago
- Spark fires is an anti-pattern playground where we deliberately break Spark applications in various ways so you can observe what happens a… ☆41 Updated last month
- Macros for generating dbt model data profiles ☆83 Updated last month
- Delta Lake helper methods. No Spark dependency. ☆22 Updated 4 months ago
- Showcase of advanced use cases relating to CI in dbt ☆69 Updated this week