tuanchris / cloud-data-lake
Data lake, data warehouse on GCP
☆54 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for cloud-data-lake
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 · Updated last year
- Execution of DBT models using Apache Airflow through Docker Compose ☆113 · Updated last year
- Cloned by the `dbt init` task ☆59 · Updated 6 months ago
- Data pipeline with dbt, Airflow, Great Expectations ☆158 · Updated 3 years ago
- Code for dbt tutorial ☆143 · Updated 5 months ago
- ☆38 · Updated 3 years ago
- Step-by-step tutorial on building a Kimball dimensional model with dbt ☆111 · Updated 4 months ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆72 · Updated last year
- A repository of sample code to accompany our blog post on Airflow and dbt. ☆167 · Updated last year
- (project & tutorial) dag pipeline tests + ci/cd setup ☆85 · Updated 3 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ☆89 · Updated 2 years ago
- ☆66 · Updated last month
- An example dbt project using AutomateDV to create a Data Vault 2.0 Data Warehouse based on the Snowflake TPC-H dataset. ☆41 · Updated 7 months ago
- Containerized end-to-end analytics of Spotify data using Python, dbt, Postgres, and Metabase ☆123 · Updated 2 years ago
- build dw with dbt ☆29 · Updated 3 weeks ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆24 · Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆133 · Updated 4 years ago
- Great Expectations Airflow operator ☆159 · Updated 3 weeks ago
- A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros. ☆180 · Updated last year
- This repo helps bootstrap the infrastructures with a modern data stack on Google Cloud Platform using Terraform. ☆115 · Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆50 · Updated 3 months ago
- Simple stream processing pipeline ☆92 · Updated 5 months ago
- ☆85 · Updated 2 years ago
- ☆52 · Updated last year
- Public source code for the Udemy online course Apache Airflow: Complete Hands-On Beginner to Advanced Class. ☆62 · Updated 4 years ago
- Airflow training for the crunch conf ☆105 · Updated 6 years ago
- Project for "Data pipeline design patterns" blog. ☆41 · Updated 3 months ago
- A curated collection of publicly available resources on dbt best practices and how data-driven organizations around the world utilize dbt ☆112 · Updated 2 years ago
- Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service ☆63 · Updated 6 months ago
- ☆34 · Updated last month