Big Data Ecosystem Docker
☆429 · Apr 29, 2023 · Updated 3 years ago
Alternatives and similar repositories for bigdata_docker
Users that are interested in bigdata_docker are comparing it to the libraries listed below.
- Big Data Ecosystem Docker ☆80 · May 17, 2022 · Updated 4 years ago
- Modern Data Stack ☆63 · Aug 8, 2025 · Updated 9 months ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆84 · Jan 2, 2025 · Updated last year
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆169 · Feb 4, 2021 · Updated 5 years ago
- Docker-compose contains the most common big data systems like: Apache Hadoop, Apache Hive, Apache Spark, Jupyter, Flink ☆28 · Oct 9, 2023 · Updated 2 years ago
- Online Bootcamp - Data Engineering, developed by IGTI - https://www.igti.com.br/ ☆11 · Dec 7, 2020 · Updated 5 years ago
- Run Hadoop Cluster within Docker Containers. ☆16 · Mar 6, 2025 · Updated last year
- Presenting 3 ways to run Spark over containers, this project is recommended to those who seek to explore Big Data out of a Hadoop Cluster… ☆11 · Nov 25, 2020 · Updated 5 years ago
- Data Engineering made simple - An opinionated Data Engineering framework ☆66 · Mar 20, 2024 · Updated 2 years ago
- Base Docker image with just essentials: Hadoop, Hive and Spark. ☆68 · Feb 3, 2021 · Updated 5 years ago
- The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Pos… ☆80 · Feb 27, 2023 · Updated 3 years ago
- Deploy bigdata platform using docker compose. Big data components include hadoop, hive, hbase, presto, flink, es, kafka, etc. ☆150 · Sep 23, 2024 · Updated last year
- ETL and visualization of the School Census (Censo Escolar) ☆10 · May 3, 2023 · Updated 3 years ago
- Build Elastic Stack using Docker ☆22 · Oct 31, 2025 · Updated 6 months ago
- ☆24 · Aug 9, 2023 · Updated 2 years ago
- Metadata Comparison Toolkit. As of now, V-1.0.0 only consists Comparison of two DDL file ( .sql ) or two DDL statement. You can also pars… ☆11 · Feb 2, 2024 · Updated 2 years ago
- ☆23 · May 16, 2023 · Updated 3 years ago
- ☆41 · Jul 23, 2024 · Updated last year
- The demo of using Kafka, Spark, Hive, Cassandra, etc by using Docker. It produces the production ready environment for any kinds of big d… ☆37 · Sep 27, 2019 · Updated 6 years ago
- Sample Docker Compose files for running Apache Ambari ☆11 · Oct 29, 2018 · Updated 7 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆508 · Nov 7, 2025 · Updated 6 months ago
- Repository to place/show my python apps ☆20 · Feb 2, 2022 · Updated 4 years ago
- Big Data infrastructure with Hadoop, Spark, Hive and NiFi deployed using Docker Compose. https://doi.org/10.5281/zenodo.18968438 ☆21 · Mar 11, 2026 · Updated 2 months ago
- Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn ☆50 · Dec 7, 2020 · Updated 5 years ago
- Apache Spark docker image ☆2,049 · Apr 20, 2026 · Updated last month
- ☆11 · Mar 15, 2025 · Updated last year
- Studies and projects. ☆61 · Jan 14, 2022 · Updated 4 years ago
- PySpark course. ☆10 · Feb 21, 2022 · Updated 4 years ago
- 🍕 Repository for gathering information on study materials in data analysis and related fields, companies that work with data and di… ☆2,433 · Apr 5, 2024 · Updated 2 years ago
- Data science ☆12 · Aug 25, 2022 · Updated 3 years ago
- Airflow plugins for implementing data pipelines. ☆51 · Apr 29, 2026 · Updated 3 weeks ago
- SEIR model for COVID-19 infection, including different clinical trajectories of infection (Brazil) ☆11 · Apr 7, 2020 · Updated 6 years ago
- ☆32 · Aug 18, 2021 · Updated 4 years ago
- Apache Hadoop docker image ☆2,323 · Feb 1, 2024 · Updated 2 years ago
- Zeppelin docker ☆16 · Nov 16, 2020 · Updated 5 years ago
- Complete Data Science solution: YouTube video recommender built from problem definition through data collection, cleaning and analysis … ☆15 · Jul 6, 2023 · Updated 2 years ago
- ☆23 · Jun 30, 2024 · Updated last year
- Big Data Docker Data Science Spark Spark4 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook ☆19 · Updated this week
- A package to run DuckDB queries from Apache Airflow. ☆21 · Jun 17, 2024 · Updated last year