☆46Jul 6, 2024Updated last year
Alternatives and similar repositories for Data-Engineering-Streaming-Project
Users that are interested in Data-Engineering-Streaming-Project are comparing it to the libraries listed below
- ☆10May 5, 2022Updated 3 years ago
- ☆16Sep 9, 2023Updated 2 years ago
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra☆145Jul 27, 2023Updated 2 years ago
- ☆30Nov 16, 2023Updated 2 years ago
- Code for the Data Engineering Zoomcamp☆20Dec 12, 2022Updated 3 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Sep 26, 2023Updated 2 years ago
- ☆13May 11, 2025Updated 10 months ago
- Contains spark dataframe solutions of leetcode questions☆24Dec 13, 2022Updated 3 years ago
- ☆12May 27, 2024Updated last year
- Branch Metrics Win32/C++ SDK☆10Jun 10, 2025Updated 9 months ago
- ☆21Jan 13, 2024Updated 2 years ago
- End to end data engineering project with kafka, airflow, spark, postgres and docker.☆109Jan 8, 2026Updated 2 months ago
- Repository for the Demo of using DVC with PyCaret & MLOps (DVC Office Hours - 20th Jan, 2022)☆11Jan 20, 2022Updated 4 years ago
- This repository shows how to deploy and serve a machine learning model with FastAPI, using MLFlow-MINIO☆18Jun 11, 2023Updated 2 years ago
- Content for a talk on "The wonderful world of data quality tools in Python"☆18May 5, 2021Updated 4 years ago
- ☆13Oct 28, 2025Updated 4 months ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to ElasticSearch and MinIO☆65Jul 21, 2023Updated 2 years ago
- Code snippets and techniques that I often reuse☆16Jan 3, 2024Updated 2 years ago
- Monotonic Optimal Binning algorithm is a statistical approach to transform continuous variables into optimal and monotonic categorical va…☆18Feb 27, 2026Updated 3 weeks ago
- A fully integrated platform for aggregating, visualising and analysing alternative data☆13Mar 15, 2024Updated 2 years ago
- Apache Flink/Apache Kafka streaming data analytics demonstration using Streaming Synthetic Sales Data Generator☆15Jun 4, 2024Updated last year
- A data engineering project with Airflow, dbt, Terraform, GCP and much more!☆26Nov 8, 2022Updated 3 years ago
- ☆12Mar 6, 2021Updated 5 years ago
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog…☆13Jun 26, 2022Updated 3 years ago
- A repo to track data engineering projects☆13Nov 11, 2022Updated 3 years ago
- ☆19Jun 22, 2022Updated 3 years ago
- ☆16May 29, 2023Updated 2 years ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, duckdb and Superset☆49Mar 9, 2026Updated last week
- (Python, PySpark)☆11Nov 15, 2020Updated 5 years ago
- This is the first project where we worked on Apache Spark. In this project, we downloaded the datasets from KAGG…☆22Oct 14, 2021Updated 4 years ago
- A series on learning Apache Spark (PySpark), with quick tips and workarounds for everyday problems☆56Sep 30, 2023Updated 2 years ago
- Sample project to demonstrate data engineering best practices☆210Feb 24, 2024Updated 2 years ago
- ☆10Jan 31, 2024Updated 2 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆864Apr 16, 2022Updated 3 years ago
- ☆146Jan 31, 2023Updated 3 years ago
- Created a Flask API🚀 which can detect toxicity in comment(text)💭 using NLP-BERT🤖. Following MLOps lifecycle🔁 to deploy ML system in p…☆14Jan 23, 2023Updated 3 years ago
- Multi-factor Risk Models of Asset or Portfolio Returns☆10May 4, 2021Updated 4 years ago
- Spark application for analyzing Apache access logs and detecting anomalies, along with a Medium article.☆21Jan 30, 2019Updated 7 years ago
- ☆13Jan 7, 2022Updated 4 years ago