martinkarlssonio / big-data-solution
☆45 · Updated last year
Alternatives and similar repositories for big-data-solution:
Users who are interested in big-data-solution are comparing it to the repositories listed below.
- This repo contains a Spark standalone cluster on Docker for anyone who wants to play with PySpark by submitting their applications. ☆28 · Updated last year
- Docker with Airflow and Spark standalone cluster ☆247 · Updated last year
- Multi-container environment with Hadoop, Spark and Hive ☆204 · Updated last year
- Hadoop-Hive-Spark cluster + Jupyter on Docker ☆65 · Updated 3 weeks ago
- Near real time ETL to populate a dashboard. ☆72 · Updated 7 months ago
- Simple repo to demonstrate how to submit a Spark job to EMR from Airflow ☆32 · Updated 4 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆39 · Updated 3 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆21 · Updated 2 years ago
- Simple stream processing pipeline