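[2025 latest edition] Big data, data analytics, e-commerce system, real-time data warehouse, offline data warehouse, and data lake: construction plans and hands-on code. Components covered: #flink #paimon #doris #seatunnel #dolphinscheduler #datart #dinky #hudi #iceberg.
☆1,055 Oct 8, 2025 Updated 4 months ago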
Alternatives and similar repositories for data-warehouse-learning
Users that are interested in data-warehouse-learning are comparing it to the libraries listed below
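- This project integrates several excellent open-source products into a full-featured data development platform. The platform provides data integration, data development, data querying, data services, data quality management, workflow scheduling, and metadata management. #dinky #dolphinscheduler #datavines #flinkcdc #openmeta… ☆624 Aug 5, 2025 Updated 6 months ago
- Learning code for big data components ☆65 May 6, 2024 Updated last year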
- Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality. ☆721 Feb 3, 2026 Updated last month
- Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation. ☆3,696 Dec 19, 2025 Updated 2 months ago
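- 🔥🔥 AllData is a definable data middle platform: with the data platform as its foundation, the data middle platform as the bridge, the machine learning platform as the factory, and large-model applications as upstream products, it provides an end-to-end digitalization solution. Official product demo, community inquiries, and commercial purchasing: https://docs.qq.com/doc/DVHlkSEtvVXVCdEFo ☆2,980 Updated this week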
- The next generation of cloud-native big data management expert, aims to help users rapidly build stable, efficient, and scalable cloud-n… ☆1,307 Jul 22, 2025 Updated 7 months ago
- Supports agile DataOps based on Flink, DataX, Flink-CDC, and Chunjun, with a web UI ☆1,283 Feb 18, 2026 Updated 2 weeks ago
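- A knowledge system for data construction and big data technology, covering mainstream frameworks such as Hadoop, Hive, Spark, and Flink and their related frameworks, plus data middle platforms, data lakes, data governance, data warehouse construction, and data-driven transformation ☆442 Aug 8, 2025 Updated 6 months ago
- A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more. ☆1,728 Feb 12, 2026 Updated 2 weeks ago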
- Make stream processing easier! Easy-to-use streaming application development framework and operation platform. ☆4,299 Feb 24, 2026 Updated last week
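- LarkMidTable is a one-stop open-source data middle platform that provides infrastructure, data governance, data development, monitoring and alerting, data services, and data visualization, efficiently empowering the data front end and delivering data services. ☆2,029 Aug 20, 2023 Updated 2 years ago
- This repository focuses on helping readers quickly understand Flink components. It contains hands-on Flink code and documentation and 200 Flink tutorial topics: Flink Datastream, Flink Table, Flink Window, Flink State, Flink Checkpoint, Flink Metrics, Fli… ☆762 Jun 14, 2024 Updated last year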
- SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool. ☆9,132 Updated this week
- SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offlin… ☆794 Jan 22, 2026 Updated last month
- ☆468 Sep 17, 2022 Updated 3 years ago
- A data integration framework ☆4,110 Dec 2, 2025 Updated 3 months ago
- Flink CDC is a streaming data integration tool ☆6,364 Updated this week
- Open data platform based on Kubernetes. Scaleph supports SeaTunnel, Flink and Doris, backed by SeaTunnel on Flink engine, Flink Kubernete… ☆399 Dec 17, 2025 Updated 2 months ago
- Table and column lineage project for Doris ☆88 Apr 30, 2024 Updated last year
- CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source bi… ☆489 Oct 31, 2025 Updated 4 months ago
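- Ultra-Lightweight AI-Powered Big Data Center | 至轻云: ultra-lightweight intelligent big data center / data middle platform ☆247 Updated this week
- This platform aims to provide an efficient, convenient environment for data processing and analysis, suited to data scientists, data engineers, and anyone who needs to process data. ☆55 Aug 5, 2025 Updated 6 months ago
- A real-time data warehouse project, built from scratch (0 to 1) ☆64 May 27, 2021 Updated 4 years ago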
- Apache Fluss is a streaming storage built for real-time analytics. ☆1,801 Updated this week
- Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch … ☆3,200 Updated this week
- Focused on big data learning and interview preparation; start your journey to big data mastery. Flink/Spark/Hadoop/Hbase/Hive... ☆10,426 Aug 7, 2023 Updated 2 years ago
- Apache Doris MCP Server ☆260 Dec 24, 2025 Updated 2 months ago
- flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Involves Flink Connector, Metrics, Library, DataStream API, Ta… ☆15,048 Mar 12, 2025 Updated 11 months ago
- Apache Doris is an easy-to-use, high performance and unified analytics database. ☆15,071 Updated this week
- General-purpose data generation platform ☆13 Mar 11, 2025 Updated 11 months ago
- Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data… ☆763 Jan 21, 2026 Updated last month
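- Parses column-level data lineage from SQL ☆96 Apr 17, 2025 Updated 10 months ago
- Big data collection and extraction platform. zdh_web is the visual management platform for the zdh series of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more ☆528 Jan 24, 2026 Updated last month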
- SuperSonic is the next-generation AI+BI platform that unifies Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradi… ☆4,714 Updated this week
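- 智数通2.0 is a new-generation, fully in-house data governance platform. It currently comprises four foundational data governance platforms (data construction, data governance, data services, and task scheduling) and implements modules for data integration, metadata management, data standards management, data quality management, data service management, data modeling management, data lineage viewing, data asset management, and task scheduling management, … ☆100 Apr 27, 2025 Updated 10 months ago
- A web platform for real-time stream computing based on Flink ☆1,869 Dec 2, 2025 Updated 3 months ago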
- Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code ☆14,175 Updated this week
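- A FlinkSQL data masking and row-level permission solution with source code. It supports user-level data masking and row-level access control, meaning a given user can only access masked data or authorized rows. This is a Flink solution for the real-time domain, similar to what Hive Ranger offers for offline data warehouses with Row-level Filter and Column Mas… ☆146 Oct 12, 2023 Updated 2 years ago
- A visual web interface for DataX: select data sources to generate data synchronization tasks in one click. Supports RDBMS, Hive, HBase, ClickHouse, MongoDB, and other data sources; batch creation of RDBMS sync tasks; integration with an open-source scheduling system; distributed execution; incremental data sync; real-time viewing of run logs; monitoring of executor resources; killing running processes; … ☆5,984 Jun 2, 2024 Updated last year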