airscholar / ApacheFlink-SalesAnalytics
This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project demonstrates how to ingest, process, and analyze sales data, showcasing the capabilities of Apache Flink for big data processing.
☆11Updated 2 years ago
Alternatives and similar repositories for ApacheFlink-SalesAnalytics
Users who are interested in ApacheFlink-SalesAnalytics are comparing it to the repositories listed below
- This project shows how to capture changes from a PostgreSQL database and stream them into Kafka☆38Updated last year
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆46Updated last year
- Simple stream processing pipeline☆110Updated last year
- This repository contains the code for a real-time election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆42Updated last year
- In this project, we set up an end-to-end data engineering project using Apache Spark, Azure Databricks, and Data Build Tool (dbt), using Azure as our …☆37Updated last year
- ☆19Updated last year
- Analytics engineering with dbt - projects and developer environment☆21Updated last year
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆46Updated last year
- Apache Airflow advanced functionalities examples☆21Updated last year
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆43Updated last year
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆75Updated 2 years ago
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/☆40Updated last year
- A custom end-to-end analytics platform for customer churn☆11Updated 6 months ago
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆290Updated 9 months ago
- End-to-end data platform: A PoC Data Platform project utilizing a modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore,…☆47Updated last year
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en…☆23Updated last year
- Ultimate guide for mastering Spark Performance Tuning and Optimization concepts and for preparing for Data Engineering interviews☆177Updated 2 months ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture☆124Updated 2 weeks ago
- ☆15Updated last year
- Sample project to demonstrate data engineering best practices☆200Updated last year
- Repository for Data Engineering Interview Series☆33Updated last year
- A self-contained, ready-to-run Airflow ELT project. Can be run locally or within Codespaces.☆79Updated 2 years ago
- End-to-end data engineering project☆57Updated 3 years ago
- An end-to-end ETL data pipeline that leverages PySpark parallel processing to process about 25 million rows of data coming from a SaaS ap…☆25Updated 2 years ago
- Code snippets for Data Engineering Design Patterns book☆275Updated 8 months ago
- This project helps me to understand the core concepts of Apache Airflow. I have created custom operators to perform tasks such as staging…☆96Updated 6 years ago
- ☆21Updated 2 years ago
- Code for dbt tutorial☆165Updated 2 months ago
- ☆46Updated 4 years ago
- A demonstration of an ELT (Extract, Load, Transform) pipeline☆31Updated last year