dnguyenngoc / real-time-analytic
This repo gives an introduction to setting up streaming analytics using open source technologies
☆25Updated 2 years ago
Alternatives and similar repositories for real-time-analytic
Users that are interested in real-time-analytic are comparing it to the libraries listed below
- Nyc_Taxi_Data_Pipeline - DE Project☆116Updated 9 months ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, and Data Build Tool (DBT), using Azure as our …☆33Updated last year
- End-to-end data platform leveraging the Modern data stack☆51Updated last year
- ELT Data Pipeline implementation in Data Warehousing environment☆26Updated 3 months ago
- An MLOps platform using Prefect, MLflow, FastAPI, Prometheus/Grafana, and Streamlit☆87Updated 2 years ago
- ☆41Updated last year
- Simple stream processing pipeline☆103Updated last year
- This project implements an ELT (Extract - Load - Transform) data pipeline with the goodreads dataset, using dagster (orchestration), spar…☆36Updated 2 years ago
- Transaction processing and visualization pipeline using PySpark Streaming☆30Updated last year
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…☆39Updated last year
- "1 config, 1 command from Jupyter Notebook to serve Millions of users", Full-stack On-Premises MLOps system for Computer Vision from Data…☆46Updated 11 months ago
- ☆61Updated 11 months ago
- Code snippets for Data Engineering Design Patterns book☆142Updated 4 months ago
- Spark all the ETL Pipelines☆33Updated 2 years ago
- Project for real-time anomaly detection using Kafka and Python☆58Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆38Updated last year
- Data pipeline that scrapes Rust cheater Steam profiles☆52Updated 3 years ago
- NLP/LLM MLOps pipeline for development, training, and evaluation, with scalable deployment and monitoring systems.☆22Updated last year
- Template for data pipelines, ML workflows, API dev and monitoring☆45Updated last year
- Spark, Airflow, Kafka☆26Updated 2 years ago
- Building a data lakehouse with open source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the lakehouse, visualize a…☆32Updated last year
- Cost Efficient Data Pipelines with DuckDB☆56Updated 2 months ago
- Code for "Efficient Data Processing in Spark" Course☆326Updated 2 months ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree☆74Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra☆141Updated 2 years ago
- Tutorials/use cases of using Prefect in an ML project.☆44Updated 2 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆41Updated last year
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆14Updated 2 years ago
- Build a data warehouse (DW) with dbt☆48Updated 9 months ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆44Updated last year