asatrya / airflow-etl-learn
This is a simple ETL pipeline built with Airflow. First, we fetch data from an API (extract). Then, we drop unused columns, convert the data to CSV, and validate it (transform). Finally, we load the transformed data into a database (load).
☆23 Updated 5 years ago
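The three stages described above can be sketched in plain Python. This is a minimal illustration and not the repository's actual code: the field names, the `items` table, and the stubbed `extract` function (which returns hard-coded records instead of calling an API) are all assumptions. SQLite stands in for the unnamed target database. In an Airflow DAG, each function would typically become its own task.

```python
import csv
import io
import sqlite3

# Hypothetical field names; the repository's real schema is not shown here.
KEEP_COLUMNS = ["id", "name", "value"]


def extract():
    """Stand-in for the API call: returns records as a list of dicts."""
    return [
        {"id": 1, "name": "a", "value": 10, "debug": "x"},
        {"id": 2, "name": "b", "value": 20, "debug": "y"},
    ]


def transform(records):
    """Drop unused columns, validate, and serialize the rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KEEP_COLUMNS)
    writer.writeheader()
    for record in records:
        # Keep only the wanted columns; anything else is discarded.
        row = {column: record.get(column) for column in KEEP_COLUMNS}
        # Validation step: every kept column must be present.
        if any(value is None for value in row.values()):
            raise ValueError(f"missing required field in {record!r}")
        writer.writerow(row)
    return buffer.getvalue()


def load(csv_text, connection):
    """Insert the CSV rows into a SQLite table and return the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    connection.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER, name TEXT, value INTEGER)"
    )
    connection.executemany(
        "INSERT INTO items (id, name, value) VALUES (:id, :name, :value)", rows
    )
    connection.commit()
    return len(rows)


def run_pipeline(connection):
    """Run extract, transform, and load in order."""
    return load(transform(extract()), connection)
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` runs the whole chain against an in-memory database, which keeps the sketch free of external services.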
Related projects
Alternatives and complementary repositories for airflow-etl-learn
- Data lake, data warehouse on GCP ☆54 Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆50 Updated 3 months ago
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces. ☆58 Updated last year
- Code for dbt tutorial ☆143 Updated 5 months ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆72 Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆133 Updated 4 years ago
- A data engineering project with Airflow, dbt, Terraform, GCP and much more! ☆22 Updated 2 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority. ☆29 Updated last year
- End to end data engineering project ☆51 Updated 2 years ago
- ☆86 Updated 2 years ago
- This project helps me to understand the core concepts of Apache Airflow. I have created custom operators to perform tasks such as staging… ☆74 Updated 5 years ago
- Public source code for the Udemy online course Apache Airflow: Complete Hands-On Beginner to Advanced Class. ☆62 Updated 4 years ago
- A tutorial for the Great Expectations library. ☆68 Updated 3 years ago
- In this repository we will store all materials for workshops, courses, etc. ☆36 Updated this week
- ☆15 Updated 9 months ago
- Processing TfL data for bike usage with Google Cloud Platform. ☆42 Updated 2 years ago
- Project for "Data pipeline design patterns" blog. ☆41 Updated 3 months ago
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 4 months ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Data engineering with dbt, published by Packt ☆61 Updated 8 months ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such … ☆116 Updated 2 years ago
- Delta-Lake, ETL, Spark, Airflow ☆44 Updated 2 years ago
- Repo for CDC with debezium blog post ☆26 Updated 2 months ago
- Step-by-step tutorial on building a Kimball dimensional model with dbt ☆113 Updated 4 months ago
- Simple stream processing pipeline ☆92 Updated 5 months ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆62 Updated 2 months ago
- Weekly Data Engineering Newsletter ☆93 Updated 4 months ago
- Template for Data Engineering and Data Pipeline projects ☆104 Updated last year
- Python ETL demo for Hackforge ☆31 Updated last year