EcZachly / microbatch-hourly-deduped-tutorial
☆102 Updated 2 years ago
Alternatives and similar repositories for microbatch-hourly-deduped-tutorial:
Users who are interested in microbatch-hourly-deduped-tutorial are comparing it to the repositories listed below.
- Sample project to demonstrate data engineering best practices ☆179 Updated last year
- This repository goes over how to handle massive variety in data engineering ☆225 Updated 2 years ago
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆256 Updated 7 months ago
- Code for dbt tutorial ☆151 Updated 9 months ago
- Template for Data Engineering and Data Pipeline projects ☆107 Updated 2 years ago
- Simple stream processing pipeline ☆99 Updated 8 months ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 6 months ago
- Code for "Advanced data transformations in SQL" free live workshop ☆72 Updated 4 months ago
- Local Environment to Practice Data Engineering ☆142 Updated 2 months ago
- Project for "Data pipeline design patterns" blog. ☆44 Updated 6 months ago
- Code for "Efficient Data Processing in Spark" Course ☆281 Updated 5 months ago
- In this repository we store all materials for dlt workshops, courses, etc. ☆115 Updated 2 months ago
- Hey this is the repo that has all the queries and data for my video game training series! ☆143 Updated 2 years ago
- ☆122 Updated 2 weeks ago
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu… ☆67 Updated last year
- Code snippets for Data Engineering Design Patterns book ☆73 Updated 3 weeks ago
- ☆177 Updated 4 years ago
- End to end data engineering project ☆53 Updated 2 years ago
- Step-by-step tutorial on building a Kimball dimensional model with dbt ☆127 Updated 7 months ago
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboa… ☆220 Updated 2 years ago
- ☆111 Updated 7 months ago
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews ☆109 Updated 9 months ago
- Tracking and measuring neighborhood and district-level eviction rates in the city of San Francisco. ☆139 Updated 4 years ago
- Open Source LeetCode for PySpark, Spark, Pandas and DBT/Snowflake ☆154 Updated 3 weeks ago
- ☆22 Updated 11 months ago
- ☆149 Updated 2 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆234 Updated 3 weeks ago
- Data pipeline that scrapes Rust cheater Steam profiles ☆52 Updated 3 years ago
- This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs. ☆60 Updated 6 months ago
- Just starting your DE journey or along the way already? I will be sharing a short list of DATA-ENGINEERING-CENTRED books that covers the… ☆34 Updated 2 years ago