anilkulkarni87 / airflow-docker
This is my Apache Airflow local development setup on Windows 10 (WSL2) and Mac, using docker-compose. It will also include some sample DAGs and workflows.
☆28 Updated last year
Alternatives and similar repositories for airflow-docker:
Users who are interested in airflow-docker are comparing it to the repositories listed below.
- Full stack data engineering tools and infrastructure set-up ☆50 Updated 4 years ago
- Delta-Lake, ETL, Spark, Airflow ☆46 Updated 2 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 Updated last year
- The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a compl… ☆13 Updated last year
- Kafka variant of the MLOps Level 1 stack ☆24 Updated 2 years ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, PostgreSQL and Superset ☆39 Updated 4 months ago
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on… ☆27 Updated 2 years ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆23 Updated 2 years ago
- ☆10 Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- A Postgres data warehouse for processing synthetic data using IAC principles ☆16 Updated 2 years ago
- Challenge Data Engineer ☆25 Updated 2 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE ☆25 Updated 3 years ago
- PySpark Tutorial for Beginners on Google Colab: Hands-On Guide ☆16 Updated 4 years ago
- Building a Data Pipeline with an Open Source Stack ☆50 Updated 9 months ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆45 Updated last year
- build dw with dbt ☆43 Updated 5 months ago
- Multi-docker container data science / engineering playground (w/ Kafka, Airflow, MLFlow, Tensorflow-Keras / SKLearn) for simulating a mic… ☆11 Updated last year
- A repository of sample code to show data quality checking best practices using Airflow. ☆74 Updated 2 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆30 Updated 4 years ago
- Finance 🏦 Data Builder 🛠️ @ postgres 🐘 ☆21 Updated 4 years ago
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces. ☆66 Updated last year
- Analytics engineering with dbt - projects and developer environment ☆18 Updated 6 months ago
- New generation opensource data stack ☆65 Updated 2 years ago
- Template for data pipelines, ML workflows, API dev and monitoring ☆45 Updated last year
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆47 Updated last year
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 Updated 2 years ago
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS). ☆15 Updated 3 years ago
- Cost Efficient Data Pipelines with DuckDB ☆51 Updated 8 months ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 2 years ago