karlchris / data-engineering
Data Engineering Handbook for beginners and everyone
☆28Updated 2 months ago
Related projects:
- Where geeks hang out and discuss data engineering☆39Updated last year
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing☆225Updated 2 months ago
- Sample project to demonstrate data engineering best practices☆156Updated 6 months ago
- Data Engineering examples covering Airflow and Mage for workflows; dbt for BigQuery, Redshift, ClickHouse; Spark and Kafka for Batch/Stre…☆50Updated 3 weeks ago
- Code for dbt tutorial☆138Updated 3 months ago
- ☆13Updated last year
- A curated list of big data engineering tools, resources and communities.☆30Updated 4 years ago
- Simple stream processing pipeline☆89Updated 3 months ago
- Code for "Efficient Data Processing in Spark" Course☆212Updated 3 months ago
- This is a template you can use for your next data engineering portfolio project.☆152Updated 3 years ago
- velib-v2: an ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools☆17Updated last week
- End to end data engineering project☆49Updated last year
- This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksCl…☆95Updated last month
- Project documentation templates derived from CRISP-DM, for use in Data Engineering projects.☆38Updated 3 years ago
- Data Engineering with Google Cloud Platform, published by Packt☆108Updated last year
- ☆17Updated 2 months ago
- Spark all the ETL Pipelines☆29Updated last year
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆47Updated 3 months ago
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow☆127Updated 4 years ago
- Data Pipeline from the Global Historical Climatology Network DataSet☆24Updated last year
- Collection for your Data Science Learning☆51Updated last week
- This repo contains a Spark standalone cluster on Docker for anyone who wants to experiment with PySpark by submitting their own applications.☆22Updated last year
- Nyc_Taxi_Data_Pipeline - DE Project☆62Updated last month
- ☆41Updated 3 years ago
- ☆8Updated 5 months ago
- ☆35Updated 2 months ago
- Django-based course management platform for Zoomcamps☆49Updated this week
- A curated list of awesome public dbt projects☆82Updated 8 months ago
- Code for "Advanced data transformations in SQL" free live workshop☆54Updated last month
- DataTalks Workshop Materials☆18Updated 6 months ago