owenrh / spark-fires
Spark fires is an anti-pattern playground where we deliberately break Spark applications in various ways so you can observe what happens and potentially recognise the issue when you come across it in your day-to-day development and support activities.
☆42Updated 10 months ago
Alternatives and similar repositories for spark-fires
Users that are interested in spark-fires are comparing it to the libraries listed below
- Delta Lake helper methods in PySpark☆324Updated last year
- A Python Library to support running data quality rules while the spark job is running⚡☆188Updated this week
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce☆28Updated 3 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow☆219Updated 2 months ago
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle…☆115Updated 6 months ago
- Turning PySpark Into a Universal DataFrame API☆434Updated this week
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…☆268Updated last month
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB and Superset☆245Updated this week
- ☆80Updated 11 months ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset☆53Updated 11 months ago
- A declarative PySpark framework for row- and aggregate-level data quality validation.☆59Updated last week
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆73Updated this week
- A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.☆208Updated last month
- Delta Lake examples☆229Updated last year
- Quickstart for any service☆161Updated last week
- Dagster SQLMesh Adapter☆72Updated 3 weeks ago
- A dbt-core plugin to weave together multi-project dbt-core deployments☆169Updated last month
- A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects☆182Updated 6 months ago
- Linter for dbt metadata☆177Updated this week
- Dagster University courses☆112Updated last week
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows☆43Updated 3 months ago
- ✨ A Pydantic to PySpark schema library☆106Updated this week
- Slow & local data allows you to move fast and deliver business value for the 99.9% of the data challenges.☆301Updated last week
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.☆135Updated last week
- Possibly the fastest DataFrame-agnostic quality check library in town.☆220Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data.☆259Updated last year
- Demo DAGs that show how to run dbt Core in Airflow using Cosmos☆64Updated 4 months ago
- Showcase of advanced use cases relating to CI in dbt☆89Updated 2 weeks ago
- ☆155Updated 2 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow.☆26Updated last year