Renien / docker-spark-livy
Spark Standalone & Livy
☆12Updated 3 years ago
Alternatives and similar repositories for docker-spark-livy
Users who are interested in docker-spark-livy are comparing it to the repositories listed below
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆38Updated 4 years ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR☆174Updated last year
- Dockerizing and Consuming an Apache Livy environment☆12Updated 2 years ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A…☆122Updated last week
- Dockerizing an Apache Spark Standalone Cluster☆43Updated 2 years ago
- Base Docker image with just essentials: Hadoop, Hive and Spark.☆69Updated 4 years ago
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow☆33Updated 4 years ago
- CSD for Apache Airflow☆20Updated 5 years ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆40Updated 6 years ago
- Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn☆51Updated 4 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆56Updated last year
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆98Updated 2 years ago
- PySpark data-pipeline testing and CICD☆28Updated 4 years ago
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file.☆164Updated 4 years ago
- DataQuality for BigData☆144Updated last year
- How to manage Slowly Changing Dimensions with Apache Hive☆55Updated 5 years ago
- Materials for the next course☆24Updated 2 years ago
- A docker using the airflow with Hadoop ecosystem (hive, spark, and sqoop)☆11Updated 4 years ago
- spark on kubernetes☆105Updated 2 years ago
- RedditR for Content Engagement and Recommendation☆22Updated 7 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker☆74Updated 4 months ago
- ☆265Updated 6 months ago
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)☆120Updated 3 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆55Updated 2 years ago
- Docker with Airflow and Spark standalone cluster☆257Updated last year
- Code snippets used in demos recorded for the blog.☆37Updated 2 weeks ago
- A library that provides useful extensions to Apache Spark and PySpark.☆223Updated last month
- A provider package for kafka☆37Updated last year
- Repository used for Spark Trainings☆53Updated 2 years ago
- A boilerplate for writing PySpark Jobs☆396Updated last year