brfulu / airflow-data-pipeline
Udacity Data Engineer Nanodegree - Airflow data pipeline
☆10 Updated 5 years ago
Alternatives and similar repositories for airflow-data-pipeline:
Users who are interested in airflow-data-pipeline are comparing it to the repositories listed below.
- Udacity Data Engineer Nanodegree - Capstone project ☆11 Updated 5 years ago
- dbt (data build tool) projects targeting AWS analytics services (redshift, glue, emr, athena) and open table formats ☆29 Updated last year
- Repository used for Spark Trainings ☆53 Updated last year
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 Updated last month
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A… ☆40 Updated 2 years ago
- My solutions for the Udacity Data Engineering Nanodegree ☆33 Updated 5 years ago
- Airflow training for the crunch conf ☆105 Updated 6 years ago
- Example orchestration pipeline for Fivetran + dbt managed by Airflow ☆21 Updated 4 years ago
- This is the documentation for the Amazon Redshift Developer Guide ☆121 Updated last year
- Udacity Data Engineer Nano Degree - Project-3 (Data Warehouse) ☆22 Updated 5 years ago
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming and a lot more ☆19 Updated 2 years ago
- Repo for holding the dbt project used to make sense of cloud cost data from the major cloud platforms ☆37 Updated 4 years ago
- Various data stream/batch process demo with Apache Scala Spark 🚀 ☆11 Updated 4 years ago
- PySpark boilerplate for running a prod-ready data pipeline ☆28 Updated 3 years ago
- How to Automate SQL: dbt (data build tool) tutorial on BigQuery with extensive NOTES ☆31 Updated last year
- (project & tutorial) dag pipeline tests + ci/cd setup ☆86 Updated 4 years ago
- Utility functions for dbt projects running on Spark ☆31 Updated last week
- Source code for the MC technical blog post "Data Observability in Practice Using SQL" ☆36 Updated 7 months ago
- Glue VSCode devcontainer setup ☆14 Updated 2 years ago
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming ☆55 Updated 6 years ago
- AWS Big Data Certification ☆25 Updated last month
- A repository of sample code to show data quality checking best practices using Airflow. ☆74 Updated last year
- Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce. ☆38 Updated 2 years ago
- Creates simple data models on Snowflake to report dbt source freshness and tests ☆23 Updated last year
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆76 Updated 6 years ago
- ☆74 Updated 4 months ago
- Airflow ETL for Meetup API ☆46 Updated 6 years ago
- A way for home buyers to know about factors affecting a state ☆47 Updated 5 years ago
- Redshift package for dbt (getdbt.com) ☆101 Updated 3 weeks ago
- This repository contains the dbt-glue adapter ☆108 Updated last week