kasun98 / datasystem
End to end data pipeline
☆19 Updated 3 months ago
Alternatives and similar repositories for datasystem
Users who are interested in datasystem are comparing it to the repositories listed below.
- Local Environment to Practice Data Engineering ☆143 Updated 6 months ago
- ☆33 Updated last year
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA… ☆38 Updated last year
- ☆21 Updated 3 months ago
- Code snippets for Data Engineering Design Patterns book ☆128 Updated 4 months ago
- Hourly Electricity Demand Prediction Service ☆108 Updated last week
- Realtime Data Engineering Project ☆30 Updated 6 months ago
- Code for "Efficient Data Processing in Spark" Course ☆325 Updated 2 months ago
- Nyc_Taxi_Data_Pipeline - DE Project ☆113 Updated 8 months ago
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems. ☆47 Updated 6 months ago
- Open Source LeetCode for PySpark, Spark, Pandas and DBT/Snowflake ☆198 Updated 3 weeks ago
- The purpose of this project's design, development, and structure is to create an end-to-end Machine Learning Operations (MLOps) lifecycle… ☆43 Updated 7 months ago
- Code for the "Build Your Own Search Engine" workshop ☆111 Updated last month
- Sample repo for startdataengineering DE 101 free course ☆68 Updated last year
- Real-time fraud transaction detection system ☆22 Updated 10 months ago
- Start building and deploying Python packages and Docker images for MLOps tasks. ☆415 Updated 4 months ago
- ☆28 Updated last year
- Financial Risk Assessment System ☆22 Updated 7 months ago
- Practical Data Engineering: A Hands-On Real-Estate Project Guide ☆669 Updated 10 months ago
- Data Engineering with Databricks Cookbook, published by Packt ☆94 Updated last year
- A turnkey MLOps pipeline demonstrating how to go from raw events to real-time predictions at scale. ☆197 Updated 6 months ago
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API… ☆20 Updated 11 months ago
- Project bike sharing predictor ☆79 Updated 4 months ago
- ☆88 Updated 10 months ago
- AWS ETL Pipeline ☆30 Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆139 Updated last year
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/ ☆38 Updated last year
- Engineering Management Leadership handbook ☆34 Updated last year
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆268 Updated last year
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆80 Updated last year