izhangzhihao / Real-time-Data-Warehouse
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
☆119Dec 15, 2023Updated 2 years ago
Alternatives and similar repositories for Real-time-Data-Warehouse
Users who are interested in Real-time-Data-Warehouse are comparing it to the repositories listed below.
- A custom end-to-end analytics platform for customer churn☆11May 15, 2025Updated 9 months ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆65Sep 23, 2023Updated 2 years ago
- A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino☆22May 30, 2022Updated 3 years ago
- The Apache Flink SQL Cookbook is a curated collection of examples, patterns, and use cases of Apache Flink SQL. Many of the recipes are c…☆914Jan 12, 2026Updated last month
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Sep 26, 2023Updated 2 years ago
- ☆11Nov 26, 2024Updated last year
- A real-time data warehouse project: building a real-time data warehouse from scratch☆64May 27, 2021Updated 4 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis☆10Feb 2, 2024Updated 2 years ago
- This plugin provides a useful feature for multi-language☆14Jul 15, 2022Updated 3 years ago
- Hadoop cluster on Docker☆11Jul 19, 2016Updated 9 years ago
- Online live streaming and video, implemented with the RED5 streaming media server and ckplay☆14Mar 27, 2016Updated 9 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆11Nov 18, 2023Updated 2 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels☆10Oct 11, 2020Updated 5 years ago
- Self-contained demo using Flink SQL and Debezium to build a CDC-based analytics pipeline. All you need is Docker!☆26May 11, 2021Updated 4 years ago
- Uses Spark to read and write Hive, HBase, and ES; a single configuration enables import and export across different databases, with wrappers for ES and HBase☆33May 6, 2017Updated 8 years ago
- Apache Flink☆18Feb 8, 2023Updated 3 years ago
- Repository containing Docker images for Spark master and slave☆15Nov 3, 2019Updated 6 years ago
- ☆175Sep 5, 2023Updated 2 years ago
- Optimizes Flink multi-stream operations (such as joins); the optimizations cover data loss and performance issues, among others☆11Apr 8, 2019Updated 6 years ago
- Spring Boot Starter for Presto☆14Nov 18, 2018Updated 7 years ago
- Adapter for dbt that executes dbt pipelines on Apache Flink☆96Mar 19, 2024Updated last year
- This project shows how to capture changes from a Postgres database and stream them into Kafka☆41May 17, 2024Updated last year
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆299Feb 10, 2026Updated last week
- A web platform for real-time stream computing based on Flink☆1,869Dec 2, 2025Updated 2 months ago
- Scalable CDC Pattern Implemented using PySpark☆18Oct 8, 2025Updated 4 months ago
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated last year
- This project provides a reverse proxy for Spark UI on Kubernetes☆17Oct 12, 2023Updated 2 years ago
- Classify crime into different categories using PySpark☆21May 20, 2019Updated 6 years ago
- A data generator source connector for Flink SQL based on data-faker.☆234Jul 24, 2023Updated 2 years ago
- ☆118Apr 21, 2023Updated 2 years ago
- A collection of Apache Hudi resources☆558Jan 4, 2026Updated last month
- THIS REPOSITORY IS DEPRECATED☆19Jul 6, 2023Updated 2 years ago
- Yet Another (Spark) ETL Framework☆21Oct 21, 2023Updated 2 years ago
- Protobuf serialization support for Apache Flink☆21Jun 1, 2021Updated 4 years ago
- An intelligent database for the AI era☆221Nov 9, 2023Updated 2 years ago
- Streaming left joins in Kafka for change data capture☆52Jan 7, 2026Updated last month
- Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.☆1,110Updated this week
- Machine learning library of Apache Flink☆328Nov 4, 2024Updated last year
- ☆23Nov 17, 2022Updated 3 years ago