Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
☆119Dec 15, 2023Updated 2 years ago
Alternatives and similar repositories for Real-time-Data-Warehouse
Users who are interested in Real-time-Data-Warehouse are comparing it to the libraries listed below
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Updated this week
- A custom end-to-end analytics platform for customer churn☆11May 15, 2025Updated 9 months ago
- A collection of Apache Hudi demos to help beginners get started with Apache Hudi quickly☆74Sep 13, 2020Updated 5 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆66Sep 23, 2023Updated 2 years ago
- A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino☆22May 30, 2022Updated 3 years ago
- The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are c…☆915Jan 12, 2026Updated last month
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Sep 26, 2023Updated 2 years ago
- A real-time data warehouse project that builds a real-time data warehouse from scratch☆64May 27, 2021Updated 4 years ago
- A Flink real-time data warehouse project☆21Jul 28, 2022Updated 3 years ago
- JavaScript library to talk to multiple OLAP backends from multiple frontends☆17Feb 4, 2013Updated 13 years ago
- Simple akka cluster example.☆12Mar 13, 2015Updated 10 years ago
- Examples of Flink on Azure☆55Oct 30, 2023Updated 2 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis☆10Feb 2, 2024Updated 2 years ago
- adidas Data Mesh implementation☆12May 13, 2022Updated 3 years ago
- Implementation of a Big Data (batch and stream) distributed processing engine in Java using Akka actors.☆12Feb 20, 2023Updated 3 years ago
- This plugin provides a useful feature for multi-language☆14Jul 15, 2022Updated 3 years ago
- Online live streaming and video built on the RED5 streaming media server and ckplay☆14Mar 27, 2016Updated 9 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆11Nov 18, 2023Updated 2 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels☆10Oct 11, 2020Updated 5 years ago
- Uses Spark to read and write Hive, HBase, and ES; a single configuration enables import and export across different databases, with wrappers for ES and HBase☆32May 6, 2017Updated 8 years ago
- This repository hosts materials for the Docker for Data Engineers workshop, offering hands-on exercises and resources tailored for data e…☆17May 23, 2024Updated last year
- Repository containing Docker images for Spark master and slave☆15Nov 3, 2019Updated 6 years ago
- ☆17Nov 26, 2024Updated last year
- Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)☆64Mar 9, 2024Updated 2 years ago
- ☆175Sep 5, 2023Updated 2 years ago
- Optimizes Flink multi-stream operations (e.g. join); the optimizations include, but are not limited to, data loss and performance issues☆11Apr 8, 2019Updated 6 years ago
- Spring Boot Starter for Presto☆14Nov 18, 2018Updated 7 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated this week
- Adapter for dbt that executes dbt pipelines on Apache Flink☆96Mar 19, 2024Updated last year
- This project shows how to capture changes from postgres database and stream them into kafka☆41May 17, 2024Updated last year
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆302Feb 23, 2026Updated 2 weeks ago
- A Flink-based web platform for real-time stream computing☆1,868Dec 2, 2025Updated 3 months ago
- Scalable CDC Pattern Implemented using PySpark☆18Oct 8, 2025Updated 5 months ago
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated last year
- Classify crime into different categories using PySpark☆21May 20, 2019Updated 6 years ago
- This project provides a reverse proxy for Spark UI on Kubernetes☆17Oct 12, 2023Updated 2 years ago
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API…☆21Jul 26, 2024Updated last year
- A data generator source connector for Flink SQL based on data-faker.☆235Jul 24, 2023Updated 2 years ago
- ☆118Apr 21, 2023Updated 2 years ago