This project helped me understand the core concepts of Apache Airflow. I created custom operators to stage the data, fill the data warehouse, and run data quality checks as the final step. The pipeline automates the ETL process and the creation of the data warehouse using Apache Airflow. Skills include: Using Airflow to …
☆100 · Aug 11, 2019 · Updated 6 years ago
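The stage, load, and quality-check flow described above can be sketched without Airflow itself. This is a minimal illustration, not the repository's code: the table names (`staging_events`, `users`), the `run_quality_checks` helper, and the use of an in-memory SQLite database in place of a real warehouse are all assumptions made for the example.

```python
import sqlite3


def run_quality_checks(conn, checks):
    """Run each (sql, expected) pair; return a message for every mismatch."""
    failures = []
    for sql, expected in checks:
        actual = conn.execute(sql).fetchone()[0]
        if actual != expected:
            failures.append(f"{sql!r}: expected {expected}, got {actual}")
    return failures


# Hypothetical warehouse: an in-memory SQLite database stands in for it.
conn = sqlite3.connect(":memory:")

# Step 1: stage raw data (hypothetical staging table and rows).
conn.execute("CREATE TABLE staging_events (user_id INTEGER, song TEXT)")
conn.executemany(
    "INSERT INTO staging_events VALUES (?, ?)",
    [(1, "a"), (2, "b"), (None, "c")],
)

# Step 2: fill a warehouse table from the staging table.
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
conn.execute(
    "INSERT INTO users "
    "SELECT DISTINCT user_id FROM staging_events WHERE user_id IS NOT NULL"
)

# Step 3: run data quality checks as the final step.
checks = [
    ("SELECT COUNT(*) FROM users WHERE user_id IS NULL", 0),
    ("SELECT COUNT(*) FROM users", 2),
]
failures = run_quality_checks(conn, checks)
```

In an Airflow deployment, each of the three steps would typically live in its own operator and the quality check would raise an error on any failure so the task is marked as failed; here the failures are simply collected in a list.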
Alternatives and similar repositories for Data-Pipelines-with-Airflow
Users that are interested in Data-Pipelines-with-Airflow are comparing it to the libraries listed below.
- ☆16 · Dec 4, 2017 · Updated 8 years ago
- Skeleton project for Apache Airflow training participants to work on. ☆17 · Jul 9, 2020 · Updated 5 years ago
- Code for Data Pipelines with Apache Airflow ☆814 · Aug 15, 2024 · Updated last year
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 · Nov 22, 2021 · Updated 4 years ago
- Beginner data engineering project - batch edition ☆571 · Mar 12, 2026 · Updated 2 weeks ago
- Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow. ☆347 · Jan 12, 2022 · Updated 4 years ago
- Using Apache Airflow to author, run and monitor complex data pipelines. ☆12 · Oct 24, 2018 · Updated 7 years ago
- Example end to end data engineering project. ☆1,398 · Dec 8, 2022 · Updated 3 years ago
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc… ☆23 · Nov 19, 2024 · Updated last year
- A demonstration of an ELT (Extract, Load, Transform) pipeline ☆31 · Feb 19, 2024 · Updated 2 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆89 · Jul 17, 2019 · Updated 6 years ago
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆166 · Jun 16, 2020 · Updated 5 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d… ☆24 · Nov 22, 2021 · Updated 4 years ago
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and … ☆34 · Feb 9, 2024 · Updated 2 years ago
- Solution to Data at ANZ virtual internship on Forage ☆10 · May 30, 2021 · Updated 4 years ago
- Example of an ETL Pipeline using Airflow ☆39 · Aug 30, 2017 · Updated 8 years ago
- ETL best practices with airflow, with examples ☆1,352 · Sep 25, 2024 · Updated last year
- This project focuses on time series forecasting to predict store sales for Corporation Favorita, a large Ecuadorian-based grocery retaile… ☆18 · Dec 4, 2023 · Updated 2 years ago
- DWH powered by Clickhouse and dbt ☆13 · Aug 4, 2024 · Updated last year
- End to end data engineering project with kafka, airflow, spark, postgres and docker. ☆110 · Jan 8, 2026 · Updated 2 months ago
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API… ☆21 · Jul 26, 2024 · Updated last year
- This repo is for the Linkedin Learning course: End-to-End Data Engineering Project ☆31 · Nov 9, 2023 · Updated 2 years ago
- streaming eight subreddits from reddit api using kafka producer & spark structured streaming. ☆19 · Mar 18, 2026 · Updated last week
- Personal Data Engineering Projects ☆1,001 · Feb 8, 2023 · Updated 3 years ago
- ☆31 · Jul 7, 2023 · Updated 2 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! ☆865 · Apr 16, 2022 · Updated 3 years ago
- Project - Data Processing and Analysis in Python Course ☆39 · Oct 10, 2018 · Updated 7 years ago
- Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr… ☆12 · May 25, 2023 · Updated 2 years ago
- Apache Airflow tutorial ☆973 · Nov 3, 2022 · Updated 3 years ago
- An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform. ☆1,497 · Mar 9, 2020 · Updated 6 years ago
- In this project I used apache airflow to scrape website periodically. This is for the tutorials I do on youtube. ☆10 · Nov 21, 2022 · Updated 3 years ago
- ☆11 · Jan 20, 2023 · Updated 3 years ago
- ☆44 · Apr 21, 2022 · Updated 3 years ago
- Implementing best practices for PySpark ETL jobs and applications. ☆2,086 · Jan 1, 2023 · Updated 3 years ago
- Get started setting up infrastructure as code on Google Cloud Platform ☆11 · Jun 13, 2021 · Updated 4 years ago
- Resources for the free AWS Data Engineering course on youtube ☆104 · Aug 30, 2021 · Updated 4 years ago
- Amazon Redshift offers a common query interface against data stored in fast, local storage as well as data from high-capacity, inexpensiv… ☆13 · Nov 26, 2018 · Updated 7 years ago
- Resources and projects from Udacity Data Engineering with AWS nano degree programme ☆28 · Apr 12, 2023 · Updated 2 years ago
- Simple stream processing pipeline ☆112 · Jun 17, 2024 · Updated last year