DucAnhNTT / bigdata-ETL-pipeline
The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a complete data pipeline with all components set up and ready to use.
☆17Updated 2 years ago
Alternatives and similar repositories for bigdata-ETL-pipeline
Users who are interested in bigdata-ETL-pipeline are comparing it to the repositories listed below
- A Postgres data warehouse for processing synthetic data using IAC principles☆20Updated 2 years ago
- ☆45Updated last year
- Full stack data engineering tools and infrastructure set-up☆57Updated 4 years ago
- Content related to Mastering PostgreSQL, along with videos.☆18Updated 4 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Updated 2 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆44Updated 2 years ago
- ☆32Updated 2 years ago
- This repo is meant to make it really easy to analyze the interplays between health and social media use.☆47Updated 3 years ago
- A demo of the Mito Streamlit Spreadsheet☆18Updated 2 years ago
- An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apa…☆27Updated 2 years ago
- ☆10Updated 3 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆46Updated 2 years ago
- Project for real-time anomaly detection using Kafka and Python☆58Updated 3 years ago
- Building a data lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the lakehouse, visualize a…☆35Updated last month
- Building ETL Pipelines with Python☆172Updated last year
- Data engineering project using UK Bus Open Data Service (BODS) to calculate late buses in real-time for any selected region in England. P…☆30Updated 2 years ago
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and …☆34Updated last year
- Cost Efficient Data Pipelines with DuckDB☆61Updated 8 months ago
- A repository to create a quick sales application☆15Updated 8 months ago
- A pipeline to detect data drift and retrain the model when there is drift☆24Updated 2 years ago
- An MLOps platform using Prefect, MLflow, FastAPI, Prometheus/Grafana, and Streamlit☆96Updated 3 years ago
- Apache Spark Guide☆35Updated 3 years ago
- A Machine Learning Pipeline built with MLflow, Prefect, BentoML, Streamlit and Evidently.☆28Updated 3 years ago
- Duke MIDS: Data Engineering and DataOps Course☆69Updated last year
- The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on…☆28Updated 3 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such …☆123Updated 3 years ago
- An end-to-end ETL data pipeline that leverages PySpark parallel processing to process about 25 million rows of data coming from a SaaS ap…☆25Updated 3 years ago
- A simple pipeline utilising cron, Postgres, AWS EC2, and Metabase☆12Updated last year
- Udacity Data Streaming Nanodegree Program☆24Updated 4 years ago
- A guide to show you how to import data for ETL☆21Updated 3 years ago