mrpowers-io / jodie
Delta Lake and filesystem helper methods
☆50 · Feb 29, 2024 · Updated last year
Alternatives and similar repositories for jodie
Users who are interested in jodie are comparing it to the libraries listed below.
- Delta Lake helper methods. No Spark dependency. ☆22 · Jan 19, 2026 · Updated 3 weeks ago
- Pandas helper functions ☆31 · Feb 19, 2023 · Updated 2 years ago
- Delta Acceptance Testing ☆23 · Aug 25, 2025 · Updated 5 months ago
- Write property based tests easily on spark dataframes ☆20 · Jan 19, 2024 · Updated 2 years ago
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated 3 weeks ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 · Sep 4, 2023 · Updated 2 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 · Jul 31, 2023 · Updated 2 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆682 · Mar 6, 2025 · Updated 11 months ago
- SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine. ☆26 · Feb 22, 2025 · Updated 11 months ago
- Powershell Scripts for Power BI ☆13 · Sep 20, 2023 · Updated 2 years ago
- Delta Lake Documentation ☆53 · Jun 19, 2024 · Updated last year
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · May 12, 2023 · Updated 2 years ago
- A bunch of low-level basic methods for data processing and monitoring with Scala Spark ☆10 · Jun 29, 2018 · Updated 7 years ago
- Command line client for the Fugue API ☆14 · Mar 7, 2023 · Updated 2 years ago
- Apache NiFi deployment on OpenShift ☆13 · Jul 18, 2023 · Updated 2 years ago
- PySpark test helper methods with beautiful error messages ☆752 · Jan 13, 2026 · Updated last month
- Optics for Spark DataFrames ☆47 · Mar 5, 2021 · Updated 4 years ago
- Spark Monitoring ☆13 · Feb 28, 2023 · Updated 2 years ago
- Delta Lake examples ☆238 · Oct 8, 2024 · Updated last year
- Code that was used as an example during the Data+AI Summit 2020 ☆15 · Mar 8, 2021 · Updated 4 years ago
- PySpark phonetic and string matching algorithms ☆41 · Feb 19, 2024 · Updated last year
- Writing PySpark logs in Apache Spark and Databricks ☆17 · Jun 13, 2022 · Updated 3 years ago
- PySpark schema generator ☆44 · Feb 23, 2023 · Updated 2 years ago
- Type safety for spark columns ☆79 · Oct 27, 2025 · Updated 3 months ago
- A Delta Lake reader for Dask ☆53 · Jul 29, 2025 · Updated 6 months ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆96 · Mar 17, 2025 · Updated 10 months ago
- Filling in the Spark function gaps across APIs ☆50 · Apr 14, 2021 · Updated 4 years ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆454 · Feb 8, 2026 · Updated last week
- ☆24 · Dec 20, 2022 · Updated 3 years ago
- An example PySpark project with pytest ☆18 · Oct 13, 2017 · Updated 8 years ago
- [under development] ETL materials to support proposal for CDM enhancements for clinical trial data ☆24 · Jun 25, 2021 · Updated 4 years ago
- ☆59 · Jan 3, 2024 · Updated 2 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆62 · Sep 6, 2024 · Updated last year
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres+ Debezium CDC, MySQL,… ☆28 · May 19, 2025 · Updated 8 months ago
- Data quality control tool built on spark and deequ ☆25 · Jan 22, 2026 · Updated 3 weeks ago
- Spark operator deployment and usage on OpenShift ☆29 · Nov 25, 2024 · Updated last year
- Draw.io files for Power BI decision trees. ☆25 · May 2, 2023 · Updated 2 years ago
- Powerful friendly Ethereum mock node & proxy ☆31 · Jun 7, 2023 · Updated 2 years ago
- Scala framework for collecting performance metrics and conducting sound experimental benchmarking. ☆13 · Nov 19, 2025 · Updated 2 months ago