big-data-europe / docker-spark
Apache Spark docker image
☆2,058 · Apr 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for docker-spark
Users who are interested in docker-spark are comparing it to the repositories listed below.
- Apache Hadoop docker image ☆2,312 · Feb 1, 2024 · Updated 2 years ago
- [EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside docker containers. Also includes spark-notebook a… ☆698 · Oct 1, 2020 · Updated 5 years ago
- A simple spark standalone cluster for your testing environment purposes ☆569 · Mar 6, 2024 · Updated last year
- Apache Flink docker image ☆197 · Jul 1, 2022 · Updated 3 years ago
- ☆1,080 · Jun 2, 2024 · Updated last year
- Docker build for Apache Spark ☆672 · Dec 30, 2021 · Updated 4 years ago
- ☆251 · Nov 15, 2022 · Updated 3 years ago
- ☆32 · Mar 7, 2018 · Updated 7 years ago
- Demo Spark application to transform data gathered on sensors for a heatmap application ☆33 · May 29, 2017 · Updated 8 years ago
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes. ☆3,100 · Feb 6, 2026 · Updated last week
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆507 · Nov 7, 2025 · Updated 3 months ago
- Apache Spark - A unified analytics engine for large-scale data processing ☆42,767 · Feb 7, 2026 · Updated last week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,575 · Updated this week
- Ready-to-run Docker images containing Jupyter applications ☆8,412 · Updated this week
- Docker Apache Airflow ☆3,814 · Mar 1, 2023 · Updated 2 years ago
- Dockerfile for Apache Kafka ☆6,981 · May 8, 2024 · Updated last year
- ☆762 · Mar 11, 2021 · Updated 4 years ago
- ☆26 · Nov 22, 2022 · Updated 3 years ago
- Hadoop docker image ☆1,208 · Jun 25, 2020 · Updated 5 years ago
- 50+ DockerHub public images for Docker & Kubernetes - DevOps, CI/CD, GitHub Actions, CircleCI, Jenkins, TeamCity, Alpine, CentOS, Debian,… ☆1,376 · Feb 3, 2026 · Updated last week
- Base classes to use when writing tests with Spark ☆1,549 · Dec 22, 2025 · Updated last month
- A curated list of awesome Apache Spark packages and resources. ☆1,862 · Oct 24, 2024 · Updated last year
- Apache Spark docker container image (Standalone mode) ☆35 · Oct 16, 2020 · Updated 5 years ago
- Upserts, Deletes And Incremental Processing on Big Data. ☆6,087 · Updated this week
- Jupyter magics and kernels for working with remote Spark clusters ☆1,363 · Sep 9, 2025 · Updated 5 months ago
- REST job server for Apache Spark ☆2,845 · Jul 8, 2025 · Updated 7 months ago
- Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn ☆51 · Dec 7, 2020 · Updated 5 years ago
- spark on kubernetes ☆104 · Feb 20, 2023 · Updated 2 years ago
- A connector for Spark that allows reading and writing to/from Redis cluster ☆947 · Oct 22, 2024 · Updated last year
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆44,172 · Feb 7, 2026 · Updated last week
- The Metadata Platform for your Data and AI Stack ☆11,545 · Feb 7, 2026 · Updated last week
- Interactive and Reactive Data Science using Scala and Spark. ☆3,151 · May 16, 2023 · Updated 2 years ago
- Postgresql configured to work as metastore for Hive. ☆32 · Dec 16, 2022 · Updated 3 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,580 · Feb 2, 2026 · Updated last week
- TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters. ☆3,859 · Jul 10, 2023 · Updated 2 years ago
- ETL best practices with airflow, with examples ☆1,353 · Sep 25, 2024 · Updated last year
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆171 · Feb 4, 2021 · Updated 5 years ago
- Spark Notebook docker image ☆10 · Dec 29, 2017 · Updated 8 years ago
- Base Docker image with just essentials: Hadoop, Hive and Spark. ☆69 · Feb 3, 2021 · Updated 5 years ago