owenrh / spark-fires
Spark fires is an anti-pattern playground where we deliberately break Spark applications in various ways, so you can observe what happens and recognise the issue when you come across it in your day-to-day development and support activities.
☆42Updated 7 months ago
Alternatives and similar repositories for spark-fires
Users that are interested in spark-fires are comparing it to the libraries listed below
- A Python Library to support running data quality rules while the spark job is running⚡☆188Updated this week
- Delta Lake helper methods in PySpark☆326Updated 9 months ago
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle…☆117Updated 2 months ago
- Custom PySpark Data Sources☆56Updated 3 weeks ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆70Updated 9 months ago
- Delta Lake examples☆225Updated 8 months ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow☆218Updated last week
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever…☆254Updated 4 months ago
- Showcase of advanced use cases relating to CI in dbt☆81Updated this week
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows☆43Updated this week
- ✨ A Pydantic to PySpark schema library☆96Updated this week
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce☆28Updated last week
- Possibly the fastest DataFrame-agnostic quality check library in town.☆195Updated this week
- Databricks Implementation of the TPC-DI Specification using Traditional Notebooks and/or Delta Live Tables☆85Updated 2 weeks ago
- A DataOps framework for building a lakehouse.☆50Updated this week
- Delta Lake helper methods. No Spark dependency.☆23Updated 9 months ago
- Declarative database change management tool for Snowflake☆123Updated this week
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes☆64Updated 3 years ago
- ☆80Updated 8 months ago
- Spark style guide☆259Updated 8 months ago
- Slow & local data allows you to move fast and deliver business value for the 99.9% of the data challenges.☆246Updated 2 months ago
- Dagster SQLMesh Adapter☆59Updated 2 weeks ago
- Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including…☆157Updated last week
- Step-by-step tutorial on building a Kimball dimensional model with dbt☆142Updated 11 months ago
- Delta lake and filesystem helper methods☆51Updated last year
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆72Updated last year
- Turning PySpark Into a Universal DataFrame API☆408Updated last week
- A lightweight helper utility which allows developers to do interactive pipeline development by having a unified source code for both DLT …☆49Updated 2 years ago
- Dagster University courses☆90Updated this week
- A dbt-core plugin to weave together multi-project dbt-core deployments☆155Updated last week