cordon-thiago / airflow-spark
Docker with Airflow and Spark standalone cluster
☆262Aug 5, 2023Updated 2 years ago
Alternatives and similar repositories for airflow-spark
Users that are interested in airflow-spark are comparing it to the libraries listed below
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks☆24Apr 2, 2022Updated 3 years ago
- ☆41Jan 24, 2023Updated 3 years ago
- ☆12Feb 11, 2022Updated 4 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆38Mar 29, 2021Updated 4 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker.☆507Nov 7, 2025Updated 3 months ago
- A recipe for a Docker container-based architecture built on Airflow, Kafka, Spark, and Docker☆20Oct 15, 2024Updated last year
- Building a Modern Data Lake with Minio, Spark, Airflow via Docker.☆23May 11, 2024Updated last year
- Code for Data Pipelines with Apache Airflow☆815Aug 15, 2024Updated last year
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).☆15Jun 3, 2021Updated 4 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆830Apr 16, 2022Updated 3 years ago
- ☆16Jan 19, 2022Updated 4 years ago
- Beginner data engineering project - batch edition☆564Jan 22, 2025Updated last year
- Execution of DBT models using Apache Airflow through Docker Compose☆126Jan 3, 2023Updated 3 years ago
- Dockerizing an Apache Spark Standalone Cluster☆42Jun 29, 2022Updated 3 years ago
- Materials for the next course☆25Feb 3, 2023Updated 3 years ago
- Code for "Efficient Data Processing in Spark" Course☆361Oct 16, 2025Updated 3 months ago
- Delta-Lake, ETL, Spark, Airflow☆48Oct 9, 2022Updated 3 years ago
- A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP!☆12Jul 6, 2023Updated 2 years ago
- Code snippets and tools published on the blog at lifearounddata.com☆12Jan 19, 2020Updated 6 years ago
- dlt-dagster-demo☆13Nov 6, 2023Updated 2 years ago
- Repository for Data Engineering Zoomcamp 2024☆14Mar 25, 2024Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow☆162Jun 16, 2020Updated 5 years ago
- Implementing best practices for PySpark ETL jobs and applications.☆2,064Jan 1, 2023Updated 3 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d…☆24Nov 22, 2021Updated 4 years ago
- Provides docker-compose template for Kafka monitoring with Splunk☆14May 15, 2023Updated 2 years ago
- The training process for Credit and Risk Assessment Large Language Model (CALM)☆10Oct 15, 2023Updated 2 years ago
- Spark Standalone & Livy☆11Jul 13, 2021Updated 4 years ago
- Apache Spark docker image☆2,058Apr 21, 2023Updated 2 years ago
- A real-time financial data streaming pipeline and visualization platform using Apache Kafka, Cassandra, and Bokeh.☆16Oct 27, 2022Updated 3 years ago
- Dockerizing and Consuming an Apache Livy environment☆13Jun 29, 2022Updated 3 years ago
- ☆12Mar 17, 2022Updated 3 years ago
- Landing Page for Pycon ID 2020☆12Aug 29, 2021Updated 4 years ago
- PySpark test helper methods with beautiful error messages☆752Jan 13, 2026Updated last month
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)☆120Jul 20, 2021Updated 4 years ago
- Build event-trigger based automation across your SaaS tools☆17Oct 3, 2023Updated 2 years ago
- Final challenge of Module 1 of the EDC Bootcamp - v2 using RAIS☆20Nov 4, 2022Updated 3 years ago
- Manage Apache Atlas and Ranger configuration for your Hadoop environment.☆16May 4, 2021Updated 4 years ago
- Sample project to demonstrate data engineering best practices☆202Feb 24, 2024Updated last year
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to Elasticsearch and MinIO☆65Jul 21, 2023Updated 2 years ago