End-to-end data engineering project with Kafka, Airflow, Spark, Postgres and Docker.
☆108Jan 8, 2026Updated last month
Alternatives and similar repositories for data-engineering-project
Users who are interested in data-engineering-project are comparing it to the repositories listed below
- Simple ETL pipeline using Python☆29May 22, 2023Updated 2 years ago
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API…☆21Jul 26, 2024Updated last year
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆45Dec 11, 2023Updated 2 years ago
- ☆12Sep 23, 2023Updated 2 years ago
- ☆46Jul 6, 2024Updated last year
- Example end to end data engineering project.☆1,387Dec 8, 2022Updated 3 years ago
- Practical Data Engineering: A Hands-On Real-Estate Project Guide☆774Sep 3, 2024Updated last year
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing☆290Jul 11, 2024Updated last year
- Open Data Stack Platform: a collection of projects and pipelines built with open data stack tools for scalable, observable data platform…☆22Dec 21, 2025Updated 2 months ago
- YouTube tutorial project☆108Oct 17, 2023Updated 2 years ago
- Cool DE Projects☆64Dec 23, 2025Updated 2 months ago
- Personal Data Engineering Projects☆993Feb 8, 2023Updated 3 years ago
- Statistical modeling and Bayesian modeling by PyMC3, Stan and TensorFlow Probability☆23Dec 31, 2019Updated 6 years ago
- ☆21May 13, 2025Updated 9 months ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Sep 26, 2023Updated 2 years ago
- ☆28Jun 21, 2024Updated last year
- Code for "Efficient Data Processing in Spark" Course☆366Oct 16, 2025Updated 4 months ago
- Data Engineering YouTube Analysis Project by Darshil Parmar☆229Dec 8, 2023Updated 2 years ago
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆316Feb 14, 2025Updated last year
- A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation an…☆23Nov 21, 2023Updated 2 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR☆90Jul 17, 2019Updated 6 years ago
- Beginner data engineering project - batch edition☆565Jan 22, 2025Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra☆144Jul 27, 2023Updated 2 years ago
- Data Engineering Capstone Project: ETL Pipelines and Data Warehouse Development☆21Jul 9, 2019Updated 6 years ago
- System Design, Solution Architecture, Data Systems Practice☆70Aug 14, 2025Updated 6 months ago
- Curated List of NLP tutorials☆30Feb 27, 2025Updated last year
- tokyo-olympic-azure-data-engineering-project☆221Jul 16, 2024Updated last year
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project☆29Nov 9, 2023Updated 2 years ago
- Hands-on advanced RAG tutorials☆30Apr 10, 2025Updated 10 months ago
- An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apa…☆27Jun 7, 2023Updated 2 years ago
- Data Engineering Project: Extracting music video metrics of Twice using YouTube API, AWS, and Tableau☆32Nov 21, 2023Updated 2 years ago
- A demonstration of an ELT (Extract, Load, Transform) pipeline☆31Feb 19, 2024Updated 2 years ago
- GitHub Action to upload iOS dSYM files using the datadog-ci tool.☆11Feb 16, 2026Updated 2 weeks ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆842Apr 16, 2022Updated 3 years ago
- MLOps for deploying a Credit Risk model☆35Jun 21, 2023Updated 2 years ago
- Chatbot to interact with a SQL database using LLMs and Langchain agents☆30Feb 19, 2024Updated 2 years ago
- Spark data pipeline that processes movie ratings data.☆31Updated this week
- Final Project for Data Engineering Zoomcamp Course 2024 🧙🔥☆11Apr 17, 2024Updated last year
- Abnormal Activity Detection using Deep Learning LRCN is a model that combines CNN and RNN to identify abnormal behavior in videos. With r…☆10Sep 22, 2023Updated 2 years ago