pyspark methods to enhance developer productivity 📣 👯 🎉
☆683 · Mar 6, 2025 · Updated last year
Alternatives and similar repositories for quinn
Users who are interested in quinn are comparing it to the libraries listed below.
- PySpark test helper methods with beautiful error messages ☆753 · Feb 25, 2026 · Updated last week
- Essential Spark extensions and helper methods ✨😲 ☆766 · Sep 14, 2025 · Updated 5 months ago
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated last month
- Spark style guide ☆272 · Sep 30, 2024 · Updated last year
- A library that provides useful extensions to Apache Spark and PySpark. ☆232 · Jan 20, 2026 · Updated last month
- Delta lake and filesystem helper methods ☆50 · Feb 29, 2024 · Updated 2 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 · Sep 4, 2023 · Updated 2 years ago
- Python API for Deequ ☆814 · Jan 21, 2026 · Updated last month
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆228 · Feb 11, 2026 · Updated 3 weeks ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆200 · Feb 27, 2026 · Updated last week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,588 · Feb 17, 2026 · Updated 2 weeks ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆454 · Feb 8, 2026 · Updated 3 weeks ago
- Testing framework for Databricks notebooks ☆316 · Apr 20, 2024 · Updated last year
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆816 · Feb 27, 2026 · Updated last week
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 · Jul 31, 2023 · Updated 2 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆187 · Oct 15, 2025 · Updated 4 months ago
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring… ☆1,227 · Sep 8, 2025 · Updated 5 months ago
- PySpark phonetic and string matching algorithms ☆41 · Feb 19, 2024 · Updated 2 years ago
- Implementing best practices for PySpark ETL jobs and applications. ☆2,075 · Jan 1, 2023 · Updated 3 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,540 · Dec 2, 2024 · Updated last year
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · May 12, 2023 · Updated 2 years ago
- A curated list of awesome Apache Spark packages and resources. ☆1,862 · Feb 27, 2026 · Updated last week
- Apache (Py)Spark type annotations (stub files). ☆118 · Aug 17, 2022 · Updated 3 years ago
- Asynchronous actions for PySpark ☆48 · Dec 2, 2021 · Updated 4 years ago
- Base classes to use when writing tests with Spark ☆1,549 · Dec 22, 2025 · Updated 2 months ago
- Fake Pandas / PySpark DataFrame creator ☆48 · Mar 10, 2024 · Updated last year
- A boilerplate for writing PySpark Jobs ☆395 · Jan 21, 2024 · Updated 2 years ago
- Qubole Sparklens tool for performance tuning Apache Spark ☆590 · Jun 26, 2024 · Updated last year
- Delta Lake helper methods. No Spark dependency. ☆22 · Jan 19, 2026 · Updated last month
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,608 · Updated this week
- Filling in the Spark function gaps across APIs ☆50 · Apr 14, 2021 · Updated 4 years ago
- Delta Lake examples ☆240 · Oct 8, 2024 · Updated last year
- Joblib Apache Spark Backend ☆249 · Apr 7, 2025 · Updated 10 months ago
- A native Rust library for Delta Lake, with bindings into Python ☆3,160 · Updated this week
- Helpers & syntactic sugar for PySpark. ☆62 · Dec 4, 2025 · Updated 3 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,139 · Feb 21, 2026 · Updated last week
- Spline agent for Apache Spark ☆202 · Feb 25, 2026 · Updated last week
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆347 · May 31, 2024 · Updated last year
- Always know what to expect from your data. ☆11,197 · Feb 27, 2026 · Updated last week