anilkulkarni87 / airflow-docker
This is my local Apache Airflow development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and workflows.
☆27 Updated last year
Alternatives and similar repositories for airflow-docker:
Users who are interested in airflow-docker are comparing it to the repositories listed below:
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 Updated last year
- The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a compl… ☆11 Updated last year
- ☆11 Updated 2 years ago
- Delta-Lake, ETL, Spark, Airflow ☆46 Updated 2 years ago
- ☆37 Updated 5 years ago
- Kafka variant of the MLOps Level 1 stack ☆24 Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up ☆49 Updated 4 years ago
- pyspark dataframe made easy ☆16 Updated 3 years ago
- build dw with dbt ☆36 Updated 3 months ago
- Project for real-time anomaly detection using Kafka and python ☆59 Updated 2 years ago
- Data pipeline that scrapes Rust cheater Steam profiles ☆52 Updated 3 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow' ☆119 Updated last year
- ☆13 Updated 2 years ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on… ☆26 Updated 2 years ago
- End-to-end data platform leveraging the Modern data stack ☆46 Updated 10 months ago
- ☆17 Updated 6 months ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, DBT, Big Query etc ☆14 Updated 2 years ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆22 Updated 2 years ago
- Bunch of Airflow Configurations and DAGs for Kubernetes, Spark based data-pipelines. Scale inside Kubernetes using spark kubernetes maste… ☆23 Updated 2 years ago
- dagster scikit-learn pipeline example. ☆44 Updated last year
- PipeRider dbt workshop for DataTalksClub DE Zoomcamp ☆17 Updated last year
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆16 Updated 5 years ago
- Execution of DBT models using Apache Airflow through Docker Compose ☆114 Updated 2 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash. ☆14 Updated last year
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆21 Updated 2 years ago
- Pyspark boilerplate for running prod ready data pipeline ☆28 Updated 3 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆39 Updated 3 years ago
- Data engineering interviews Q&A for data community by data community ☆63 Updated 4 years ago