An end-to-end data pipeline that ingests, processes, and stores data. Apache Airflow schedules scripts that fetch data from an API and send it to Kafka; Spark then processes the data and writes it to Cassandra. The pipeline, built with Python and Apache Zookeeper, is containerized with Docker for easy deployment and scalability.
☆21 · Jul 26, 2024 · Updated last year
Alternatives and similar repositories for e2e-structured-streaming
Users that are interested in e2e-structured-streaming are comparing it to the libraries listed below.
- Fully dockerized Data Warehouse (DWH) using Airflow, dbt, and PostgreSQL, with a dashboard built using redash · ☆25 · Nov 12, 2022 · Updated 3 years ago
- Apache Airflow advanced functionalities examples · ☆21 · Mar 22, 2024 · Updated 2 years ago
- End-to-End BI & DW project: Data Warehousing design and modeling (MySQL), ETL (PDI) and Dashboard (Tableau) · ☆16 · Aug 10, 2020 · Updated 5 years ago
- The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog… · ☆13 · Jun 26, 2022 · Updated 3 years ago
- Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to establish an end-to-end data pipeline · ☆32 · Oct 25, 2023 · Updated 2 years ago
- A demo project comparing two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster · ☆15 · Sep 9, 2021 · Updated 4 years ago
- ☆12 · Mar 6, 2021 · Updated 5 years ago
- A demonstration of an ELT (Extract, Load, Transform) pipeline · ☆31 · Feb 19, 2024 · Updated 2 years ago
- Cutting-edge, opinionated, and ambitious project builder for power users and researchers. · ☆16 · Feb 2, 2026 · Updated 3 months ago
- Graduation project | Data Lakehouse · ☆42 · Feb 9, 2026 · Updated 2 months ago
- A testing ground for Plotly Dash app development including app features and experimenting with dashboard visualizations. · ☆10 · Oct 15, 2023 · Updated 2 years ago
- ☆68 · Sep 24, 2025 · Updated 7 months ago
- NSCollectionView sample for OS X 10.11 ElCapitan · ☆12 · Nov 24, 2017 · Updated 8 years ago
- ☆10 · Feb 2, 2024 · Updated 2 years ago
- A data pipeline moving data from a relational database system (RDBMS) to a Hadoop file system (HDFS). · ☆15 · Jun 3, 2021 · Updated 4 years ago
- ☆11 · Aug 20, 2024 · Updated last year
- Glue ETL job or EMR Spark job that reads from the Data Catalog, modifies the data, and uploads it to S3 and the Data Catalog · ☆13 · Aug 26, 2023 · Updated 2 years ago
- ☆13 · Sep 23, 2023 · Updated 2 years ago
- A collection of practice projects on big data topics, including Hadoop, Spark, Spark Streaming, and Kafka · ☆11 · Mar 7, 2019 · Updated 7 years ago
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion… · ☆14 · Jul 3, 2021 · Updated 4 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA… · ☆44 · Jan 4, 2024 · Updated 2 years ago
- ☆23 · Jul 8, 2025 · Updated 9 months ago
- Modern GIS Web Client for JavaScript, based on MapboxGL-JS, OpenLayers, Leaflet · ☆14 · Sep 16, 2022 · Updated 3 years ago
- TTS utility · ☆12 · Aug 2, 2020 · Updated 5 years ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. · ☆111 · Jan 8, 2026 · Updated 3 months ago
- ☆16 · Feb 11, 2026 · Updated 2 months ago
- Spark Notebook docker image · ☆10 · Dec 29, 2017 · Updated 8 years ago
- View data on a tile38 server · ☆14 · Aug 18, 2024 · Updated last year
- ☆16 · Nov 27, 2025 · Updated 5 months ago
- An example of a project generated with cookiecutter-uv · ☆15 · Apr 10, 2026 · Updated 3 weeks ago
- Create streaming data, send it to Kafka, transform it with PySpark, and load it into ElasticSearch and MinIO · ☆65 · Jul 21, 2023 · Updated 2 years ago
- End-to-end data platform: A PoC Data Platform project utilizing modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore,… · ☆48 · Oct 14, 2024 · Updated last year
- [SC2023] POMELO: Fine-grained Population Mapping from Coarse Census Counts and Open Geodata · ☆13 · Aug 5, 2024 · Updated last year
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de… · ☆12 · Nov 18, 2023 · Updated 2 years ago
- 🚀 A simple javascript template for rapid development of GitHub actions. · ☆17 · Feb 24, 2023 · Updated 3 years ago
- ☆22 · Mar 15, 2011 · Updated 15 years ago
- ☆27 · Aug 28, 2023 · Updated 2 years ago
- Create and Run 🚀 Dotfiles projects for Windows 10/11 · ☆23 · Jan 26, 2025 · Updated last year
- Stock Advisor · ☆12 · Jun 13, 2025 · Updated 10 months ago