Wittline / docker-livy
Dockerizing and Consuming an Apache Livy environment
☆11 Updated 2 years ago
Related projects
Alternatives and complementary repositories for docker-livy
- Simple repo to demonstrate how to submit a spark job to EMR from Airflow ☆32 Updated 4 years ago
- ☆86 Updated 2 years ago
- 😈 Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆43 Updated 5 years ago
- Simple stream processing pipeline ☆91 Updated 4 months ago
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews ☆67 Updated 5 months ago
- Apache Spark 3 - Structured Streaming Course Material ☆119 Updated last year
- ☆25 Updated last year
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres + Debezium CDC, MySQL,… ☆24 Updated 7 months ago
- Docker with Airflow and Spark standalone cluster ☆244 Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆133 Updated 4 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆41 Updated 2 years ago
- This repo contains commands that data engineers use in day to day work. ☆59 Updated last year
- Ravi Azure ADB ADF Repository ☆64 Updated 6 months ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Spark all the ETL Pipelines ☆32 Updated last year
- Sample project to demonstrate data engineering best practices ☆164 Updated 8 months ago
- Spark data pipeline that processes movie ratings data. ☆27 Updated 3 weeks ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆170 Updated 2 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such … ☆116 Updated 2 years ago
- ☆8 Updated last month
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆80 Updated 5 years ago
- Apartments Data Pipeline using Airflow and Spark. ☆18 Updated 2 years ago
- Demo DAGs that show how to run dbt Core in Airflow using Cosmos ☆37 Updated last month
- Code for my "Efficient Data Processing in SQL" book. ☆49 Updated 3 months ago
- Delta-Lake, ETL, Spark, Airflow ☆44 Updated 2 years ago
- Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,… ☆56 Updated 2 years ago
- Spark development environment for kubernetes, spark-submit and jupyter notebook ☆19 Updated 2 years ago
- PySpark Cheatsheet ☆35 Updated last year
- Delta Lake Documentation ☆46 Updated 4 months ago