Multi-container environment with Hadoop, Spark and Hive
☆235 · May 5, 2025 · Updated last year
Alternatives and similar repositories for docker-hadoop-spark
Users that are interested in docker-hadoop-spark are comparing it to the libraries listed below.
- ☆47 · Jul 4, 2023 · Updated 2 years ago
- Run Hadoop Cluster within Docker Containers. ☆16 · Mar 6, 2025 · Updated last year
- A docker using the airflow with Hadoop ecosystem (hive, spark, and sqoop) ☆12 · May 2, 2021 · Updated 5 years ago
- Apache Hadoop docker image ☆2,323 · Feb 1, 2024 · Updated 2 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆42 · Jun 29, 2022 · Updated 3 years ago
- Zeppelin docker ☆16 · Nov 16, 2020 · Updated 5 years ago
- Sample code and documentation for very basic things that I can't remember but want to aggregate in one place ☆13 · Nov 7, 2021 · Updated 4 years ago
- Open episode of the data engineering practice course ☆32 · Jul 2, 2024 · Updated last year
- Module for pipelines concept in PySpark ☆17 · Mar 27, 2024 · Updated 2 years ago
- Docker with Airflow and Spark standalone cluster ☆264 · Aug 5, 2023 · Updated 2 years ago
- ☆21 · Mar 11, 2025 · Updated last year
- Big Data Docker Data Science Spark Spark4 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook ☆19 · Updated this week
- ☆16 · Mar 9, 2026 · Updated 2 months ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 · Mar 29, 2021 · Updated 5 years ago
- The rust implementation of the Defluencer protocol. ☆13 · Sep 3, 2025 · Updated 8 months ago
- Apache Spark docker image ☆2,049 · Apr 20, 2026 · Updated last month
- My solutions from Google Foobar. For educational purposes only. ☆10 · Sep 21, 2017 · Updated 8 years ago
- ☆11 · Nov 26, 2024 · Updated last year
- ☆14 · Apr 2, 2022 · Updated 4 years ago
- A prolly tree (probabilistic tree) is a data structure designed to provide efficient storage, retrieval, and modification of ordered data… ☆28 · May 21, 2026 · Updated last week
- Analytics Engineer Course ☆20 · May 17, 2023 · Updated 3 years ago
- ☆12 · Mar 5, 2021 · Updated 5 years ago
- [EXPERIMENTAL] This repo includes deployment instructions for running HDFS/Spark inside docker containers. Also includes spark-notebook a… ☆700 · Oct 1, 2020 · Updated 5 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆24 · Apr 2, 2022 · Updated 4 years ago
- A simple spark standalone cluster for your testing environment purposes ☆567 · Mar 6, 2024 · Updated 2 years ago
- fedex-commercial-invoice ☆21 · Apr 28, 2016 · Updated 10 years ago
- Simple Tab Sorter++ ☆17 · May 28, 2025 · Updated last year
- Cloud-native Trino (prestosql) + Hive + Minio + Superset ☆23 · Nov 29, 2021 · Updated 4 years ago
- ☆10 · May 5, 2022 · Updated 4 years ago
- Apply and/or check recommendations from the CIS benchmarks. ☆23 · Mar 18, 2026 · Updated 2 months ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆101 · Jan 31, 2023 · Updated 3 years ago
- This is a GitHub for all of my NiFi Templates ☆47 · Mar 25, 2026 · Updated 2 months ago
- A Hadoop cluster based on Docker, including Hive and Spark. ☆82 · Nov 13, 2022 · Updated 3 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆508 · Nov 7, 2025 · Updated 6 months ago
- A curated list of awesome DuckLake tools and resources ☆124 · Apr 20, 2026 · Updated last month
- Experiments produced during an end of studies project (ETS, H2018) ☆14 · Nov 21, 2018 · Updated 7 years ago
- This repo contains the code of the paper "RayJoin: Fast and Precise Spatial Join", ICS'24 ☆12 · May 21, 2026 · Updated last week
- Files for the Docker and Kubernetes on Google Cloud Hands-On labs ☆11 · Mar 14, 2023 · Updated 3 years ago
- Docker for airflow with mysql as backend ☆12 · Nov 15, 2018 · Updated 7 years ago