yahwang / Awesome-Data-Engineering
(GitBook) A curated list of awesome Data Engineering resources
★37 Updated last month
Alternatives and similar repositories for Awesome-Data-Engineering
Users who are interested in Awesome-Data-Engineering are comparing it to the repositories listed below.
- Awesome list for data pipeline ★35 Updated 2 years ago
- ★83 Updated 2 years ago
- ★28 Updated 3 years ago
- A summary of data engineer technologies ★18 Updated last year
- Coding practice and research on the component technologies of big data pipelines ★41 Updated 5 years ago
- Stream smartphone data with FastAPI, Kafka, QuestDB, and Docker. ★26 Updated 2 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ★61 Updated 2 years ago
- ★114 Updated 2 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority. ★31 Updated 2 years ago
- A curated list of awesome DataOps tools ★204 Updated 2 months ago
- ★48 Updated 3 years ago
- DataOps (Data Operation), MLOps (Machine Learning Operation) Contents ★131 Updated 4 years ago
- This is a repo with links to everything you'd ever want to learn about data engineering ★10 Updated 10 months ago
- Data engineering interviews Q&A for data community by data community ★64 Updated 5 years ago
- Code snippets for Data Engineering Design Patterns book ★207 Updated 6 months ago
- ★44 Updated last year
- This project shows how to serve a TF-based image classification model as a web service with TFServing, Docker, and Kubernetes (GKE). ★125 Updated 3 years ago
- A Series of Notebooks on how to start with Kafka and Python ★152 Updated 7 months ago
- ★14 Updated 11 months ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ★90 Updated 3 years ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to ElasticSearch and MinIO ★63 Updated 2 years ago
- Apache Airflow Best Practices, published by Packt ★49 Updated 11 months ago
- Gitbook Repo for Practical Data Pipeline ★25 Updated 3 years ago
- Data lake, data warehouse on GCP ★56 Updated 3 years ago
- Public source code for the Batch Processing with Apache Beam (Python) online course ★18 Updated 5 years ago
- Full stack data engineering tools and infrastructure set-up ★56 Updated 4 years ago
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow' ★121 Updated 2 years ago
- The Python fake data producer for Apache Kafka® is a complete demo app allowing you to quickly produce JSON fake streaming datasets and … ★85 Updated last year
- A Snowflake GPT Demo using SqlAlchemy ★23 Updated 2 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ★88 Updated 4 years ago