☆133Jul 6, 2021Updated 4 years ago
Alternatives and similar repositories for ApacheSparkBook
Users that are interested in ApacheSparkBook are comparing it to the libraries listed below
- ☆11Feb 23, 2020Updated 6 years ago
- flink sql☆11Jun 21, 2022Updated 3 years ago
- A collection of Apache Hudi demos to help beginners get started with Apache Hudi quickly☆74Sep 13, 2020Updated 5 years ago
- ☆11Jul 18, 2021Updated 4 years ago
- An RPC framework leveraging Spark RPC module☆209Mar 13, 2019Updated 6 years ago
- A collection of Apache Hudi resources☆558Jan 4, 2026Updated last month
- A Spark Reliability Testing Suite☆13Jan 10, 2017Updated 9 years ago
- ☆15Jan 19, 2020Updated 6 years ago
- Profiling Spark Applications for Performance Comparison and Diagnosis☆17Nov 11, 2018Updated 7 years ago
- Hudi documentation in Chinese☆37Jan 9, 2020Updated 6 years ago
- Some resources about Ray Forward Meetup☆30Dec 25, 2025Updated 2 months ago
- An extension of open-source Flink's real-time SQL; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax☆2,061Feb 21, 2024Updated 2 years ago
- A Spark Atlas connector to track data lineage in Apache Atlas☆266Nov 16, 2022Updated 3 years ago
- Presto source code analysis☆51Feb 2, 2018Updated 8 years ago
- Secondary development based on the Flink 1.8 source code; see the MD for details☆82May 20, 2020Updated 5 years ago
- ClickHouse principles and applied practice☆208Jun 15, 2020Updated 5 years ago
- Spark SQL listener to record lineage information☆28Jan 24, 2021Updated 5 years ago
- [The Road to Big Data Mastery: learning path, interview experiences, and résumés]☆136Apr 13, 2022Updated 3 years ago
- Fast JVM collection☆60Mar 8, 2015Updated 10 years ago
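- A standalone module, split out for viewing the syntax tree generated by Spark SQL☆91May 26, 2019Updated 6 years ago
- An introduction to learning Scala and some frequently asked questions (FAQ); study notes on Scala covering its common syntax and some design principles of the standard library☆26May 12, 2016Updated 9 years ago
- A collection of big data material, including distributed storage engines, distributed compute engines, and data warehouse construction. Keywords: Hadoop, HBase, ES, Kudu, Hive, Presto, Spark, Flink, Kylin, ClickHouse☆231Dec 5, 2024Updated last year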
- ☆29Jun 21, 2022Updated 3 years ago
- Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators th…☆30Feb 21, 2022Updated 4 years ago
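- [Translation] Think Data Structures, Chinese edition☆27Jan 2, 2021Updated 5 years ago
- Apache Flink source code analysis series, based on git tag 1.1.2☆233Feb 10, 2017Updated 9 years ago
- Coolplay Spark (酷玩 Spark): Spark source code analysis, Spark libraries, and more☆3,482May 18, 2022Updated 3 years ago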
- ☆30Sep 16, 2022Updated 3 years ago
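- DataTunnel is an ultra-high-performance distributed data integration tool built on the Spark engine that supports synchronizing massive volumes of data. Its DSL syntax, built on Spark extensions and combined with Spark SQL, fits easily into data warehouse ETLT workflows and is simple to use.☆34Updated this week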
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆2,304Feb 23, 2026Updated last week
- Spark ClickHouse Connector built on DataSourceV2 API☆212Feb 20, 2026Updated last week
- Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Topics include Flink Connector, Metrics, Library, DataStream API, Ta…☆15,048Mar 12, 2025Updated 11 months ago
- Spark 2.3.1 source code walkthrough☆201Dec 5, 2022Updated 3 years ago
- Apache Calcite☆5,077Updated this week
- A data integration framework☆4,110Dec 2, 2025Updated 3 months ago
- Uniffle is a high performance, general purpose Remote Shuffle Service.☆445Updated this week
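- Study notes on "Spark: The Definitive Guide: Big Data Processing Made Simple". Not quite a full translation; rather a retelling based on personal experience and understanding. Also published on Juejin; follow the link to go there.☆36Aug 4, 2019Updated 6 years ago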
- ☆227Feb 24, 2026Updated last week
- Apache Flink shaded artifacts repository☆140Jan 12, 2026Updated last month