yahwang / Awesome-Data-Engineering
(GitBook) A curated list of awesome Data Engineering resources
★36 Updated last week
Alternatives and similar repositories for Awesome-Data-Engineering
Users who are interested in Awesome-Data-Engineering are comparing it to the repositories listed below
- ★83 Updated 2 years ago
- Awesome list for datapipeline ★35 Updated 2 years ago
- Summary of data engineer technologies ★18 Updated last year
- Coding practice and research on the component technologies of a big data pipeline ★41 Updated 5 years ago
- ★28 Updated 2 years ago
- ★112 Updated 2 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority. ★31 Updated last year
- data engineer advanced training course ★9 Updated 3 months ago
- Stream smartphone data with FastAPI, Kafka, QuestDB, and Docker. ★26 Updated last year
- Full stack data engineering tools and infrastructure set-up ★55 Updated 4 years ago
- Data engineering interviews Q&A for data community by data community ★64 Updated 5 years ago
- This project shows how to serve a TF-based image classification model as a web service with TFServing, Docker, and Kubernetes (GKE). ★125 Updated 3 years ago
- Kafka Connect connector that reads JSON data from Apache Kafka and sends JSON records to another Kafka topic. ★51 Updated last year
- DataOps (Data Operation), MLOps (Machine Learning Operation) Contents ★131 Updated 4 years ago
- Korean translation of the official Spark documentation ★16 Updated 3 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ★59 Updated last year
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,… ★90 Updated 3 years ago
- Dockerizing an Apache Spark Standalone Cluster ★43 Updated 3 years ago
- Code snippets for Data Engineering Design Patterns book ★142 Updated 4 months ago
- ★42 Updated last year
- ★48 Updated 3 years ago
- Apache Airflow Best Practices, published by Packt ★45 Updated 9 months ago
- Gitbook Repo for Practical Data Pipeline ★25 Updated 3 years ago
- Create streaming data, send it to Kafka, transform it with PySpark, and load it into ElasticSearch and MinIO ★63 Updated 2 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ★88 Updated 4 years ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ★92 Updated last month
- Weekly Data Engineering Newsletter ★96 Updated last year
- Everything you need for a DE job ★204 Updated 2 months ago
- ★20 Updated 4 years ago
- Mastering Big Data Analytics with PySpark, Published by Packt ★160 Updated 11 months ago