airscholar / ApacheFlink-SalesAnalytics
This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project demonstrates how to ingest, process, and analyze sales data, showcasing the capabilities of Apache Flink for big data processing.
☆11 · Updated 11 months ago
Related projects
Alternatives and complementary repositories for ApacheFlink-SalesAnalytics
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar… ☆37 · Updated 11 months ago
- This project shows how to capture changes from a Postgres database and stream them into Kafka ☆31 · Updated 5 months ago
- This repository contains the code for a real-time election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆28 · Updated 10 months ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA… ☆29 · Updated 10 months ago
- In this project, we set up an end-to-end data engineering using Apache Spark, Azure Databricks, Data Build Tool (DBT) using Azure as our … ☆21 · Updated 10 months ago
- ☆27 · Updated 11 months ago
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en… ☆15 · Updated 9 months ago
- Simple stream processing pipeline ☆91 · Updated 4 months ago
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka… ☆196 · Updated last year
- ☆60 · Updated this week
- End to end data engineering project ☆49 · Updated 2 years ago
- ☆86 · Updated 2 years ago
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews ☆67 · Updated 5 months ago
- data-warehouse-snowflake-for-data-engineering ☆14 · Updated last year
- A custom end-to-end data pipeline for customer churn ☆9 · Updated last week
- ☆42 · Updated 3 years ago
- Content related to Mastering Postgresql along with videos. ☆14 · Updated 3 years ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh… ☆61 · Updated last year
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB. ☆37 · Updated last year
- ☆27 · Updated 11 months ago
- The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Pos… ☆53 · Updated last year
- Data Engineering, Data Warehouse, Data Mart, Cloud Data, AWS, SAS, Redshift, S3 ☆25 · Updated 3 years ago
- Data Engineering on GCP ☆30 · Updated 2 years ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, DBT, BigQuery, etc. ☆11 · Updated 2 years ago
- Simple ETL pipeline using Python ☆20 · Updated last year
- This repo is for the LinkedIn Learning course: End-to-End Data Engineering Project ☆15 · Updated last year
- I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti… ☆28 · Updated last year
- Create streaming data, send it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO ☆57 · Updated last year
- End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker. ☆64 · Updated 3 months ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆55 · Updated 5 months ago