PySpark test helper methods with beautiful error messages
★753 · Feb 25, 2026 · Updated this week
Alternatives and similar repositories for chispa
Users who are interested in chispa are comparing it to the libraries listed below.
- pyspark methods to enhance developer productivity 📣 👯 🎉 ★683 · Mar 6, 2025 · Updated 11 months ago
- Delta Lake helper methods in PySpark ★327 · Jan 19, 2026 · Updated last month
- A Python Library to support running data quality rules while the spark job is running ⚡ ★200 · Updated this week
- Spark style guide ★272 · Sep 30, 2024 · Updated last year
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring… ★1,227 · Sep 8, 2025 · Updated 5 months ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ★454 · Feb 8, 2026 · Updated 3 weeks ago
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ★228 · Feb 11, 2026 · Updated 2 weeks ago
- A library that provides useful extensions to Apache Spark and PySpark. ★232 · Jan 20, 2026 · Updated last month
- Delta lake and filesystem helper methods ★50 · Feb 29, 2024 · Updated 2 years ago
- Testing framework for Databricks notebooks ★316 · Apr 20, 2024 · Updated last year
- Delta Lake examples ★240 · Oct 8, 2024 · Updated last year
- Python API for Deequ ★814 · Jan 21, 2026 · Updated last month
- Delta Lake helper methods. No Spark dependency. ★22 · Jan 19, 2026 · Updated last month
- Essential Spark extensions and helper methods ✨😲 ★766 · Sep 14, 2025 · Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator ★48 · Mar 10, 2024 · Updated last year
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ★816 · Updated this week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ★3,588 · Feb 17, 2026 · Updated 2 weeks ago
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ★21 · Jun 12, 2024 · Updated last year
- Data Contracts engine for the modern data stack. https://www.soda.io ★2,298 · Updated this week
- Open, Multi-modal Catalog for Data & AI ★3,320 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ★8,608 · Updated this week
- Base classes to use when writing tests with Spark ★1,549 · Dec 22, 2025 · Updated 2 months ago
- A native Rust library for Delta Lake, with bindings into Python ★3,156 · Updated this week
- Column-wise type annotations for pyspark DataFrames ★95 · Updated this week
- Always know what to expect from your data. ★11,197 · Updated this week
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ★10 · Jul 31, 2023 · Updated 2 years ago
- An Open Standard for lineage metadata collection ★2,330 · Updated this week
- pytest plugin to run the tests with support of pyspark ★88 · May 21, 2025 · Updated 9 months ago
- Drop-in replacement for Apache Spark UI ★413 · Feb 17, 2026 · Updated 2 weeks ago
- ✨ A Pydantic to PySpark schema library ★121 · Updated this week
- Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline ★151 · Aug 14, 2024 · Updated last year
- Yet Another (Spark) ETL Framework ★21 · Oct 21, 2023 · Updated 2 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ★282 · Oct 7, 2025 · Updated 4 months ago
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ★45 · Jan 24, 2026 · Updated last month
- Turning PySpark Into a Universal DataFrame API ★493 · Updated this week
- Scalable and efficient data transformation framework - backwards compatible with dbt. ★2,928 · Updated this week
- A library that brings useful functions from various modern database management systems to Apache Spark ★61 · Sep 4, 2023 · Updated 2 years ago
- Construct Apache Airflow DAGs Declaratively via YAML configuration files ★1,418 · Updated this week
- Data Lineage Tracking And Visualization Solution ★656 · Updated this week