Drop-in replacement for Apache Spark UI
☆409Feb 17, 2026Updated last week
Alternatives and similar repositories for spark
Users that are interested in spark are comparing it to the libraries listed below
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa…☆815Feb 5, 2026Updated 3 weeks ago
- Apache DataFusion Comet Spark Accelerator☆1,142Feb 20, 2026Updated last week
- ☆15Jul 25, 2025Updated 7 months ago
- Apache Spark Kubernetes Operator☆263Updated this week
- Multi-hop declarative data pipelines☆124Feb 12, 2026Updated 2 weeks ago
- This project provides a reverse proxy for Spark UI on Kubernetes☆17Oct 12, 2023Updated 2 years ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A…☆134Jan 5, 2026Updated last month
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.☆1,516Updated this week
- PySpark test helper methods with beautiful error messages☆753Updated this week
- Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processin…☆1,161Feb 20, 2026Updated last week
- Open Control Plane for Tables in Data Lakehouse☆380Updated this week
- Dashboard for operating Flink jobs and deployments.☆44Jan 31, 2026Updated last month
- Delta Lake helper methods. No Spark dependency.☆22Jan 19, 2026Updated last month
- Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.☆1,039Feb 19, 2026Updated last week
- ☆28Dec 5, 2025Updated 2 months ago
- Uniffle is a high performance, general purpose Remote Shuffle Service.☆446Feb 12, 2026Updated 2 weeks ago
- A Python Library to support running data quality rules while the spark job is running⚡☆200Feb 20, 2026Updated last week
- Python API for Deequ☆813Jan 21, 2026Updated last month
- A tool to benchmark L (loading) workloads within ETL workloads☆31Feb 10, 2026Updated 2 weeks ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆136Oct 25, 2023Updated 2 years ago
- LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.☆1,174Updated this week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg☆1,845Feb 20, 2026Updated last week
- Basic Spark utilities☆13Feb 20, 2025Updated last year
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Feb 5, 2026Updated 3 weeks ago
- Arrow-Powered Data Exchange☆15Feb 7, 2025Updated last year
- Open, Multi-modal Catalog for Data & AI☆3,313Updated this week
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.☆347May 31, 2024Updated last year
- Qubole Sparklens tool for performance tuning Apache Spark☆590Jun 26, 2024Updated last year
- Spark metrics related custom classes and sinks (e.g. Prometheus)☆187Aug 2, 2022Updated 3 years ago
- Helm chart for Lakekeeper - a Rust Native Iceberg REST Catalog☆23Updated this week
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆94May 9, 2025Updated 9 months ago
- Sample code to collect Apache Iceberg metrics for table monitoring☆29Aug 18, 2024Updated last year
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.☆1,196Updated this week
- ☆22Nov 4, 2025Updated 3 months ago
- RemoteStorageManager for Apache Kafka® Tiered Storage☆223Updated this week
- Rust based high-performance Apache Uniffle shuffle-server☆62Feb 10, 2026Updated 2 weeks ago
- Analytics Accelerator Library for Amazon S3 is an open source library that accelerates data access from client applications to Amazon S3.☆65Feb 4, 2026Updated 3 weeks ago
- A write-audit-publish implementation on a data lake without the JVM☆45Aug 12, 2024Updated last year
- World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.☆2,882Feb 21, 2026Updated last week