This project is a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP sockets, Apache Spark, an OpenAI LLM, Kafka, and Elasticsearch. It covers each stage: data acquisition, processing, sentiment analysis with ChatGPT, producing to a Kafka topic, and connecting to Elasticsearch.
☆45Jan 4, 2024Updated 2 years ago
Alternatives and similar repositories for RealtimeStreamingEngineering
Users that are interested in RealtimeStreamingEngineering are comparing it to the libraries listed below.
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆14Dec 27, 2023Updated 2 years ago
- This project provides an end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The spark cluste…☆11Oct 11, 2023Updated 2 years ago
- This project shows how to capture changes from a Postgres database and stream them into Kafka☆42May 17, 2024Updated 2 years ago
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD…☆16Sep 19, 2023Updated 2 years ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, and Data Build Tool (DBT), using Azure as our …☆39Dec 18, 2023Updated 2 years ago
- This repository hosts materials for the Docker for Data Engineers workshop, offering hands-on exercises and resources tailored for data e…☆17May 23, 2024Updated 2 years ago
- This course is designed to provide learners with the fundamental skills needed for data engineering using Python. The objective is to int…☆34Aug 15, 2024Updated last year
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆48Mar 14, 2024Updated 2 years ago
- Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)☆69Mar 9, 2024Updated 2 years ago
- This is an end to end MLOps system☆34Nov 27, 2025Updated 7 months ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆217Oct 23, 2023Updated 2 years ago
- FastAPI CLI is a command-line tool designed to help developers quickly generate a structured project file system for FastAPI applications…☆12Feb 3, 2025Updated last year
- NoSQL extract, transform, load (ETL) toolkit with Python☆16Jun 21, 2026Updated last week
- ☆10Sep 9, 2021Updated 4 years ago
- ☆12Apr 17, 2023Updated 3 years ago
- A few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake developme…☆12Feb 26, 2020Updated 6 years ago
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated 2 years ago
- (Python, PySpark)☆11Nov 15, 2020Updated 5 years ago
- Architected a SQL-based Superstore Data Warehouse to analyse trends of customer records to identify profitable market segments and design…☆12Sep 24, 2024Updated last year
- ELT Data Pipeline implementation in Data Warehousing environment☆30May 2, 2025Updated last year
- ☆14Oct 17, 2023Updated 2 years ago
- ☆11Sep 16, 2021Updated 4 years ago
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API…☆21Jul 26, 2024Updated last year
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆51Dec 2, 2023Updated 2 years ago
- an end-to-end data pipeline extracting music listening habits and producing an insightful dashboard☆18Mar 31, 2024Updated 2 years ago
- This is a simple iris flower classification model deployment project as flask app on Docker or Kubernetes.☆11Feb 16, 2022Updated 4 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Apr 15, 2026Updated 2 months ago
- A Multi-branch CI-CD Pipeline Using Jenkins, Docker, AWS, Maven To Deploy an Odoo ERP custom module & a simple Java Maven web app.☆13Dec 23, 2022Updated 3 years ago
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 9 months ago
- ☆15Apr 18, 2024Updated 2 years ago
- ☆59Aug 14, 2024Updated last year
- Analytics engineering with dbt - projects and developer environment☆22Sep 27, 2024Updated last year
- This is initial Development☆14Dec 7, 2020Updated 5 years ago
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres and Docker.☆115Jan 8, 2026Updated 5 months ago
- Repository to host micro service implementation patterns.☆14Jun 25, 2025Updated last year
- StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit☆16Mar 14, 2024Updated 2 years ago
- End to end data pipeline to extract and analyze submissions from any subreddit using Pushshift, python, dbt and BigQuery.☆12Jul 17, 2023Updated 2 years ago
- Sample project to demonstrate data engineering best practices☆220Feb 24, 2024Updated 2 years ago
- In this project I have built an ETL pipeline which scrapes the trending repository based on month, week and day LIVE extract other related info…☆12Sep 9, 2023Updated 2 years ago