☆45 · Jul 6, 2024 · Updated last year
Alternatives and similar repositories for Data-Engineering-Streaming-Project
Users that are interested in Data-Engineering-Streaming-Project are comparing it to the libraries listed below.
- A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP! ☆12 · Jul 6, 2023 · Updated 2 years ago
- ☆10 · May 5, 2022 · Updated 4 years ago
- Courses and projects on DataCamp ☆11 · Jun 28, 2020 · Updated 6 years ago
- Companion repository that goes along with Snowflake's "Advanced Data Engineering with Snowflake" course ☆37 · Apr 23, 2025 · Updated last year
- ☆15 · Sep 9, 2023 · Updated 2 years ago
- Get data from an API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆146 · Jul 27, 2023 · Updated 2 years ago
- ☆30 · Nov 16, 2023 · Updated 2 years ago
- Code for the Data Engineering Zoomcamp ☆20 · Dec 12, 2022 · Updated 3 years ago
- Produce Kafka messages, consume them, and upload them into Cassandra and MongoDB. ☆43 · Sep 26, 2023 · Updated 2 years ago
- Contains Spark DataFrame solutions to LeetCode questions ☆24 · Dec 13, 2022 · Updated 3 years ago
- Branch Metrics Win32/C++ SDK ☆10 · Jun 10, 2025 · Updated last year
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 · Dec 28, 2022 · Updated 3 years ago
- ☆21 · Jan 13, 2024 · Updated 2 years ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. ☆115 · Jan 8, 2026 · Updated 5 months ago
- Example Flink and Kafka integration project ☆15 · Nov 28, 2015 · Updated 10 years ago
- This repository is about how to deploy and serve a machine learning model with FastAPI, using MLFlow-MINIO ☆18 · Jun 11, 2023 · Updated 3 years ago
- Content for a talk on "The wonderful world of data quality tools in Python" ☆18 · May 5, 2021 · Updated 5 years ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and take it to ElasticSearch and MinIO ☆65 · Jul 21, 2023 · Updated 2 years ago
- Docker with Airflow and Spark standalone cluster ☆265 · Aug 5, 2023 · Updated 2 years ago
- ☆16 · Feb 17, 2020 · Updated 6 years ago
- Fine-tuned LLM examples running on Kubernetes ☆11 · Oct 1, 2023 · Updated 2 years ago
- Monotonic Optimal Binning algorithm is a statistical approach to transform continuous variables into optimal and monotonic categorical va… ☆20 · May 31, 2026 · Updated 3 weeks ago
- A fully integrated platform for aggregating, visualising and analysing alternative data ☆13 · Mar 15, 2024 · Updated 2 years ago
- GitHub Action That Submits Argo Workflows For Execution on Your GKE Cluster ☆16 · Jan 25, 2021 · Updated 5 years ago
- Dockerfile for OpenLogReplicator ☆21 · Mar 3, 2026 · Updated 3 months ago
- ☆12 · Mar 6, 2021 · Updated 5 years ago
- A Sphinx extension for adding PyScript to a page ☆15 · Jun 22, 2026 · Updated last week
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog… ☆13 · Jun 26, 2022 · Updated 4 years ago
- Example of how to build a machine learning training workflow on AWS with Prefect ☆12 · Nov 2, 2022 · Updated 3 years ago
- ☆16 · May 29, 2023 · Updated 3 years ago
- Data engineering project using UK Bus Open Data Service (BODS) to calculate late buses in real-time for any selected region in England. P… ☆32 · Apr 2, 2023 · Updated 3 years ago
- This project is focused on the deployment phase of machine learning. Docker and FastAPI are used to deploy a dockerized server of tra… ☆27 · Jan 7, 2023 · Updated 3 years ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, DuckDB and Superset ☆49 · Apr 5, 2026 · Updated 2 months ago
- (Python, PySpark) ☆11 · Nov 15, 2020 · Updated 5 years ago
- Deployment of Airflow 2.0 using ECS Fargate and AWS CDK. ☆14 · Nov 5, 2021 · Updated 4 years ago
- Implement different variants of gradient descent in Python using NumPy ☆11 · Apr 23, 2019 · Updated 7 years ago
- A Trino connector to access git repository contents ☆17 · Feb 9, 2026 · Updated 4 months ago
- ☆22 · Jul 18, 2024 · Updated last year
- This is the first project where we worked on Apache Spark. In this project, we downloaded the datasets from KAGG… ☆23 · Oct 14, 2021 · Updated 4 years ago