Pathairush / airflow_hive_spark_sqoop
A Docker setup that runs Airflow with the Hadoop ecosystem (Hive, Spark, and Sqoop)
☆13 · Updated 4 years ago
Alternatives and similar repositories for airflow_hive_spark_sqoop
Users who are interested in airflow_hive_spark_sqoop are comparing it to the repositories listed below
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆83 · Updated last year
- Docker with Airflow and Spark standalone cluster ☆262 · Updated 2 years ago
- This project provides Apache Spark SQL, RDD, DataFrame and Dataset examples in Scala language ☆568 · Updated last year
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow ☆34 · Updated 5 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆505 · Updated 2 months ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆49 · Updated 2 years ago
- Life-cycle: Internal working of HDFS, SQOOP, HIVE, SPARK, HBASE, KAFKA with code. ☆15 · Updated 6 years ago
- ☆269 · Updated last year
- The Internals of Spark SQL