Official Dockerfile for Apache Spark
☆166Feb 18, 2026Updated last month
Alternatives and similar repositories for spark-docker
Users who are interested in spark-docker are comparing it to the libraries listed below.
- Apache Spark Kubernetes Operator☆272Updated this week
- A set of transformations for Kafka Connect☆22Mar 1, 2026Updated last month
- ☆18Nov 4, 2024Updated last year
- A Spark data source for reading Microsoft Excel files☆13Jul 1, 2024Updated last year
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.☆3,111Apr 6, 2026Updated last week
- Spark-Dashboard is an open-source monitoring solution for Apache Spark that provides real-time performance dashboards using containers an…☆134Apr 1, 2026Updated 2 weeks ago
- A tool to get better debug info on spark's memory usage☆42Aug 21, 2019Updated 6 years ago
- Sparglim✨ makes PySpark App Configurable and Deploy Spark Connect Server Easier!☆42Jan 19, 2026Updated 2 months ago
- Python API for Deequ☆815Mar 9, 2026Updated last month
- Apache Spark Connect Client for Golang☆249Oct 13, 2025Updated 6 months ago
- Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.☆1,042Updated this week
- Docker packaging for Apache Flink☆358Dec 3, 2025Updated 4 months ago
- Helm charts for Trino and Trino Gateway☆194Mar 30, 2026Updated 2 weeks ago
- ☆12Feb 18, 2026Updated last month
- The official Model Context Protocol (MCP) server for DataHub (https://datahub.com)☆71Updated this week
- The web application is a Data Caching Service designed and implemented using microservices architecture.☆13Jun 14, 2021Updated 4 years ago
- A library for reading data from Amazon S3 with optimised listing using Amazon SQS using Spark SQL Streaming (or Structured Streaming).☆18Apr 20, 2024Updated last year
- Enables automatic refactoring and linting of Maven projects written in Scala using Scalafix.☆26Mar 27, 2026Updated 2 weeks ago
- Collection of NiFi-related stuff☆24Oct 27, 2022Updated 3 years ago
- Apache DataFusion Comet Spark Accelerator☆1,163Updated this week
- Apache Flink☆20Jan 26, 2026Updated 2 months ago
- This repository contains code samples shared on https://dev.java/ and https://inside.java/☆14Jun 16, 2024Updated last year
- Apache Spark Website☆134Apr 8, 2026Updated last week
- Maven packaging and lifecycle for Trino plugins☆15Apr 3, 2026Updated last week
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer☆52Oct 30, 2023Updated 2 years ago
- Official Java implementation of Apache Arrow☆82Updated this week
- A re-implementation of Hadoop DistCP in Apache Spark☆47Dec 20, 2023Updated 2 years ago
- ☆20Nov 17, 2025Updated 4 months ago
- A JupyterHub authenticator using Kerberos☆12Jul 16, 2019Updated 6 years ago
- Official Dockerfile for Delta Lake☆61Feb 24, 2026Updated last month
- Flux Operator Helm Charts☆24Apr 7, 2026Updated last week
- A library that provides useful extensions to Apache Spark and PySpark.☆235Mar 18, 2026Updated 3 weeks ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis☆10Feb 2, 2024Updated 2 years ago
- A tool for translating Scala source code into readable and maintainable Java code☆13Jan 3, 2026Updated 3 months ago
- Example of how to build machine learning training workflow on AWS by Prefect☆12Nov 2, 2022Updated 3 years ago
- trino + hive + minio with postgres in docker compose☆27Aug 18, 2023Updated 2 years ago
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆2,322Updated this week
- Adapter for dbt that executes dbt pipelines on Apache Flink☆98Mar 19, 2024Updated 2 years ago
- Docker image for Spark history server on Kubernetes☆15Mar 13, 2020Updated 6 years ago