Multi-container environment with Hadoop, Spark and Hive
☆235 · May 5, 2025 · Updated last year
Alternatives and similar repositories for docker-hadoop-spark
Users that are interested in docker-hadoop-spark are comparing it to the libraries listed below.
- ☆47 · Jul 4, 2023 · Updated 2 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆84 · Jan 2, 2025 · Updated last year
- A docker using the airflow with Hadoop ecosystem (hive, spark, and sqoop) ☆12 · May 2, 2021 · Updated 5 years ago
- Apache Hadoop docker image ☆2,322 · Feb 1, 2024 · Updated 2 years ago
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆169 · Feb 4, 2021 · Updated 5 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆42 · Jun 29, 2022 · Updated 3 years ago
- Toy Hadoop cluster combining various SQL-on-Hadoop variants ☆13 · Nov 16, 2017 · Updated 8 years ago
- Zeppelin docker ☆16 · Nov 16, 2020 · Updated 5 years ago
- Sample code and documentation for very basic things that I can't remember but want to aggregate in one place ☆13 · Nov 7, 2021 · Updated 4 years ago
- ☆13 · Feb 18, 2022 · Updated 4 years ago
- Module for pipelines concept in PySpark ☆17 · Mar 27, 2024 · Updated 2 years ago
- Docker with Airflow and Spark standalone cluster ☆265 · Aug 5, 2023 · Updated 2 years ago
- ☆21 · Mar 11, 2025 · Updated last year
- Big Data Docker Data Science Spark Spark4 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook ☆19 · May 31, 2026 · Updated 2 weeks ago
- ☆16 · Mar 9, 2026 · Updated 3 months ago
- ☆16 · Feb 17, 2020 · Updated 6 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 · Mar 29, 2021 · Updated 5 years ago
- CI/CD platform using Jenkins, docker, Sonar, Nexus, Jmeter, Selenium, Ansible, AWX, Grafana, Prometheus, Zabbix, Stress-ng ☆20 · May 25, 2026 · Updated 3 weeks ago
- Apache Spark docker image ☆2,051 · Apr 20, 2026 · Updated last month
- A prolly tree (probabilistic tree) is a data structure designed to provide efficient storage, retrieval, and modification of ordered data… ☆29 · Jun 2, 2026 · Updated 2 weeks ago
- Analytics Engineer Course ☆20 · May 17, 2023 · Updated 3 years ago
- [EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside docker containers. Also includes spark-notebook a… ☆700 · Oct 1, 2020 · Updated 5 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆24 · Apr 2, 2022 · Updated 4 years ago
- A simple spark standalone cluster for your testing environment purposes ☆567 · Mar 6, 2024 · Updated 2 years ago
- Advent of code - 30 challenges for learning Dagster ☆27 · Dec 19, 2024 · Updated last year
- Simple Tab Sorter++ ☆17 · May 28, 2025 · Updated last year
- Cloud-native Trino (prestosql) + Hive + Minio + Superset ☆23 · Nov 29, 2021 · Updated 4 years ago
- Run an open-source data LakeHouse locally using Docker Compose ☆12 · May 31, 2024 · Updated 2 years ago
- ☆11 · Feb 19, 2024 · Updated 2 years ago
- Apply and/or check recommendations from the CIS benchmarks. ☆23 · Mar 18, 2026 · Updated 3 months ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆100 · Jan 31, 2023 · Updated 3 years ago
- Kafka Connect Examples ☆43 · Sep 27, 2022 · Updated 3 years ago
- This is a GitHub for all of my NiFi Templates ☆47 · Mar 25, 2026 · Updated 2 months ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆509 · Nov 7, 2025 · Updated 7 months ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more. ☆10 · Jan 23, 2023 · Updated 3 years ago
- This tool can easily make / build an emr cluster edge node / client node / gateway node ☆10 · Jun 1, 2022 · Updated 4 years ago
- Automated STIG Benchmark Compliance Audit for RHEL 7 with Ansible & GOSS ☆18 · Nov 18, 2024 · Updated last year
- Files for the Docker and Kubernetes on Google Cloud Hands-On labs ☆11 · Mar 14, 2023 · Updated 3 years ago
- Script installing OpenStack Juno in 03 nodes ☆21 · Dec 4, 2015 · Updated 10 years ago