End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API, sends the data to Kafka, and processes it with Spark before writing the results to Cassandra. The pipeline, built with Python and Apache ZooKeeper, is containerized with Docker for easy deployment and scalability.
☆21 Jul 26, 2024 Updated last year
Alternatives and similar repositories for e2e-structured-streaming
Users who are interested in e2e-structured-streaming are comparing it to the libraries listed below.
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆48 Dec 11, 2023 Updated 2 years ago
- Apache Airflow advanced functionalities examples ☆21 Mar 22, 2024 Updated 2 years ago
- End-to-End BI & DW project: Data Warehousing design and modeling (MySQL), ETL (PDI) and Dashboard (Tableau) ☆17 Aug 10, 2020 Updated 5 years ago
- Open Data Stack Platform: a collection of projects and pipelines built with open data stack tools for scalable, observable data platform… ☆22 May 11, 2026 Updated last week
- Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to build an end-to-end data pipeline ☆32 Oct 25, 2023 Updated 2 years ago
- A demo project comparing two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster ☆15 Sep 9, 2021 Updated 4 years ago
- ☆12 Mar 6, 2021 Updated 5 years ago
- A demonstration of an ELT (Extract, Load, Transform) pipeline ☆31 Feb 19, 2024 Updated 2 years ago
- ☆13 Sep 15, 2024 Updated last year
- A curated list of awesome Python frameworks, libraries, software and resources ☆15 Jun 6, 2018 Updated 7 years ago
- Upload shots to dribbble.com ☆14 Mar 27, 2012 Updated 14 years ago
- Graduation project | Data Lakehouse ☆43 Feb 9, 2026 Updated 3 months ago
- A testing ground for Plotly Dash app development including app features and experimenting with dashboard visualizations. ☆10 Oct 15, 2023 Updated 2 years ago
- ☆10 Feb 2, 2024 Updated 2 years ago
- ☆10 Jul 19, 2020 Updated 5 years ago
- Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog ☆13 Aug 26, 2023 Updated 2 years ago
- A collection of practice projects on big data topics, including Hadoop, Spark, Spark Streaming, and Kafka ☆11 Mar 7, 2019 Updated 7 years ago
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion… ☆14 Jul 3, 2021 Updated 4 years ago
- Modern GIS Web Client for JavaScript, based on MapboxGL-JS, OpenLayers, Leaflet ☆13 Sep 16, 2022 Updated 3 years ago
- ☆23 Jul 8, 2025 Updated 10 months ago
- TTS utility ☆12 Aug 2, 2020 Updated 5 years ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. ☆113 Jan 8, 2026 Updated 4 months ago
- um its my portfolio? ☆16 Feb 10, 2026 Updated 3 months ago
- ☆16 Feb 11, 2026 Updated 3 months ago
- Spark Notebook docker image ☆10 Dec 29, 2017 Updated 8 years ago
- View data on a tile38 server ☆14 Aug 18, 2024 Updated last year
- ☆16 Nov 27, 2025 Updated 5 months ago
- End-to-end data platform: A PoC Data Platform project utilizing modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore,… ☆48 Oct 14, 2024 Updated last year
- ☆11 Jan 31, 2019 Updated 7 years ago
- [SC2023] POMELO: Fine-grained Population Mapping from Coarse Census Counts and Open Geodata ☆13 Aug 5, 2024 Updated last year
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de… ☆12 Nov 18, 2023 Updated 2 years ago
- ☆22 Mar 15, 2011 Updated 15 years ago
- 🚀 A simple javascript template for rapid development of GitHub actions. ☆17 Feb 24, 2023 Updated 3 years ago
- ☆27 Aug 28, 2023 Updated 2 years ago
- DuckDB Copilot Extension ☆10 Jan 12, 2026 Updated 4 months ago
- Transformer Conformal Prediction for Time Series ☆18 Apr 13, 2026 Updated last month
- Create and Run 🚀 Dotfiles projects for Windows 10/11 ☆23 Jan 26, 2025 Updated last year
- Stock Advisor ☆12 Jun 13, 2025 Updated 11 months ago
- ☆11 Feb 7, 2024 Updated 2 years ago