suhothayan / hadoop-spark-pig-hive
Docker with hadoop spark pig hive
☆24 Updated 5 years ago
Alternatives and similar repositories for hadoop-spark-pig-hive:
Users who are interested in hadoop-spark-pig-hive are comparing it to the repositories listed below.
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆161 Updated 3 years ago
- Base Docker image with just essentials: Hadoop, Hive and Spark. ☆68 Updated 3 years ago
- A Docker setup using Airflow with the Hadoop ecosystem (Hive, Spark, and Sqoop) ☆11 Updated 3 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark) ☆41 Updated 5 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆63 Updated 2 weeks ago
- ☆24 Updated 3 years ago
- Apache Spark 3 - Structured Streaming Course Material ☆121 Updated last year
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆39 Updated 3 years ago
- Apache Spark 3 - Structured Streaming Course Material ☆44 Updated 4 years ago
- Docker-compose contains the most common big data systems like: Apache Hadoop, Apache Hive, Apache Spark, Jupyter, Flink ☆27 Updated last year
- Hadoop, Hive, Parquet and Hue in docker-compose v3 ☆42 Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Multi-container environment with Hadoop, Spark and Hive ☆204 Updated last year
- A Spark cluster setup running on Docker containers ☆60 Updated 5 years ago
- Docker Big Data Tools: This docker-compose file is configured to run multiple nodes. This is a Hadoop Cluster that contains the necessary… ☆28 Updated 3 years ago
- Spark and Hive docker containers sharing a common MySQL metastore ☆26 Updated 4 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆29 Updated 4 years ago
- Apache Flink docker image ☆192 Updated 2 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆174 Updated last year
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- Code examples on Apache Spark using python ☆106 Updated 2 years ago
- This repository contains code for Spark Streaming ☆21 Updated 3 years ago
- Repository used for Spark Trainings ☆53 Updated last year
- ☆32 Updated 6 years ago
- Educational notes and hands-on problems with solutions for the Hadoop ecosystem ☆86 Updated 5 years ago
- A series on learning Apache Spark (PySpark), with quick tips and workarounds for everyday problems ☆45 Updated last year
- ☆91 Updated 2 years ago
- Docker with Airflow and Spark standalone cluster ☆247 Updated last year