Multi-container environment with Hadoop, Spark and Hive
☆235 · May 5, 2025 · Updated last year
Alternatives and similar repositories for docker-hadoop-spark
Users that are interested in docker-hadoop-spark are comparing it to the libraries listed below.
- ☆47 · Jul 4, 2023 · Updated 2 years ago
- Apache Hadoop docker image ☆2,321 · Feb 1, 2024 · Updated 2 years ago
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆170 · Feb 4, 2021 · Updated 5 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆42 · Jun 29, 2022 · Updated 3 years ago
- Toy Hadoop cluster combining various SQL-on-Hadoop variants ☆13 · Nov 16, 2017 · Updated 8 years ago
- Zeppelin docker ☆16 · Nov 16, 2020 · Updated 5 years ago
- Sample code and documentation for very basic things that I can't remember but want to aggregate in one place ☆13 · Nov 7, 2021 · Updated 4 years ago
- Open episode of the data engineering practice course ☆32 · Jul 2, 2024 · Updated last year
- ☆14 · Mar 11, 2023 · Updated 3 years ago
- Docker with Airflow and Spark standalone cluster ☆263 · Aug 5, 2023 · Updated 2 years ago
- Big Data Docker Data Science Spark Spark4 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook ☆19 · Apr 26, 2026 · Updated last week
- ☆16 · Mar 9, 2026 · Updated 2 months ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 · Mar 29, 2021 · Updated 5 years ago
- CI/CD platform using Jenkins, docker, Sonar, Nexus, Jmeter, Selenium, Ansible, AWX, Grafana, Prometheus, Zabbix, Stress-ng ☆21 · Apr 26, 2026 · Updated last week
- Apache Spark docker image ☆2,050 · Apr 20, 2026 · Updated 2 weeks ago
- Dockerfiles and Docker Compose for HDP 2.6 with Blueprints ☆23 · Jan 16, 2018 · Updated 8 years ago
- ☆10 · Jun 8, 2016 · Updated 9 years ago
- Cloud based Data Platform based on Apache Spark ☆28 · Updated this week
- ☆11 · Nov 26, 2024 · Updated last year
- Analytics Engineer Course ☆20 · May 17, 2023 · Updated 2 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB. ☆43 · Sep 26, 2023 · Updated 2 years ago
- [EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside docker containers. Also includes spark-notebook a… ☆700 · Oct 1, 2020 · Updated 5 years ago
- A simple spark standalone cluster for your testing environment purposes ☆567 · Mar 6, 2024 · Updated 2 years ago
- Docker with hadoop spark pig hive ☆27 · Jul 22, 2019 · Updated 6 years ago
- Advent of code - 30 challenges for learning Dagster ☆27 · Dec 19, 2024 · Updated last year
- fedex-commercial-invoice ☆21 · Apr 28, 2016 · Updated 10 years ago
- Simple Tab Sorter++ ☆16 · May 28, 2025 · Updated 11 months ago
- covenant.tistory.com example code ☆13 · Jan 14, 2023 · Updated 3 years ago
- ☆46 · Jan 29, 2024 · Updated 2 years ago
- Cloud-native Trino (prestosql) + Hive + Minio + Superset ☆23 · Nov 29, 2021 · Updated 4 years ago
- ☆10 · May 5, 2022 · Updated 4 years ago
- A simple GUI made for creating jobs in YOLOv5 ☆15 · Mar 16, 2026 · Updated last month
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆507 · Nov 7, 2025 · Updated 6 months ago
- Experiments produced during an end of studies project (ETS, H2018) ☆14 · Nov 21, 2018 · Updated 7 years ago
- A curated list of awesome DuckLake tools and resources ☆116 · Apr 20, 2026 · Updated 2 weeks ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more. ☆10 · Jan 23, 2023 · Updated 3 years ago
- This tool can easily make / build an emr cluster edge node / client node / gateway node ☆10 · Jun 1, 2022 · Updated 3 years ago
- Files for the Docker and Kubernetes on Google Cloud Hands-On labs ☆11 · Mar 14, 2023 · Updated 3 years ago
- Script for installing OpenStack Juno on 3 nodes ☆21 · Dec 4, 2015 · Updated 10 years ago