alibaba/SparkCube

SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.

☆136

Alternatives and similar repositories for SparkCube

Users that are interested in SparkCube are comparing it to the libraries listed below.

oap-project / sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.
☆37 · Jan 3, 2023 · Updated 3 years ago
lightcopy / parquet-index
Spark SQL index for Parquet tables
☆134 · May 6, 2021 · Updated 5 years ago
aistack / sql-booster
This is a library for SQL optimizing/rewriting including Materialized View rewrite
☆70 · Jun 21, 2022 · Updated 4 years ago
apache / kyuubi-client
Client libraries of end users of Apache Kyuubi
☆11 · May 15, 2026 · Updated 2 months ago
hbutani / icebergSQL
Integration of Iceberg table management into Spark SQL
☆11 · Jan 21, 2020 · Updated 6 years ago
yaooqinn / spark-ranger
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
☆59 · Nov 11, 2021 · Updated 4 years ago
apache / celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
☆1,056 · Updated this week
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,353 · Updated this week
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
apache / paimon-vector-index
Apache Paimon Vector Index: pure Rust IVF-PQ for data lake vector search.
☆18 · Jul 10, 2026 · Updated last week
JoshRosen / hive
Mirror of Apache Hive
☆33 · Mar 16, 2020 · Updated 6 years ago
yaooqinn / spark-history-cli
CLI tool for querying Apache Spark History Server REST API
☆28 · Mar 22, 2026 · Updated 3 months ago
allwefantasy / delta-plus
A library based on delta for Spark and MLSQL
☆60 · Dec 24, 2020 · Updated 5 years ago
sutugin / spark-streaming-jdbc-source
☆26 · Apr 15, 2021 · Updated 5 years ago
oracle / spark-oracle
On-the-fly translation of Spark programs to run natively on your Oracle DB. Your Spark programs require no changes.
☆36 · Apr 15, 2025 · Updated last year
passionke / starry
fast spark local mode
☆35 · Aug 20, 2018 · Updated 7 years ago
oap-project / gazelle_plugin
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
☆255 · Feb 21, 2023 · Updated 3 years ago
uber / RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
☆335 · Sep 29, 2023 · Updated 2 years ago
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
Tencent / Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shu…
☆256 · Apr 7, 2023 · Updated 3 years ago
oap-project / remote-shuffle
Spark* shuffle plugin for support shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis…
☆21 · Mar 15, 2024 · Updated 2 years ago
allwefantasy / spark-binlog
A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
☆152 · Apr 21, 2023 · Updated 3 years ago
boostscale / velox4j
Community Java bindings for https://github.com/facebookincubator/velox
☆43 · Updated this week
awesome-kyuubi / hadoop-testing
Testing Sandbox for Hadoop Ecosystem Components
☆45 · Jun 16, 2026 · Updated last month
cjuexuan / mynote
☆233 · Sep 15, 2022 · Updated 3 years ago
streamnative / pulsar-spark
Spark Connector to read and write with Pulsar
☆120 · May 26, 2026 · Updated last month
bebee4java / ides
Intelligent Data Exploration Service, a one-stop Data + AI data solution!
☆36 · Jul 10, 2023 · Updated 3 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
squito / spark-memory
A tool to get better debug info on spark's memory usage
☆42 · Aug 21, 2019 · Updated 6 years ago
zhoufwind / palo-deploy
Deploy Apache Doris (incubating) FE & BE using shell scripts
☆11 · Jul 8, 2019 · Updated 7 years ago
teeyog / IQL
An ad hoc query service based on the Spark SQL engine.
☆377 · Dec 16, 2023 · Updated 2 years ago
byzer-org / byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
☆1,835 · May 29, 2024 · Updated 2 years ago
oap-project / pmem-shuffle
Spark* Shuffle plugin for support shuffling through remote persistent memory over fabrics, which leverages the RDMA network and remote pe…
☆14 · Sep 18, 2023 · Updated 2 years ago
bebee4java / sqlalarm
Big data smart alarm by sql
☆12 · May 11, 2021 · Updated 5 years ago
aliyun / aliyun-emapreduce-datasources
Extended datasource support for Spark/Hadoop on Aliyun E-MapReduce.
☆170 · Nov 30, 2023 · Updated 2 years ago
maropu / spark-tpcds-datagen
All the things about TPC-DS in Apache Spark
☆111 · Jun 15, 2023 · Updated 3 years ago
NetEase / spark-alarm
Alerting and monitoring tool for Apache Spark
☆23 · May 20, 2022 · Updated 4 years ago
yaooqinn / spark-postgres
PostgreSQL and GreenPlum Data Source for Apache Spark
☆35 · May 6, 2026 · Updated 2 months ago
lhbench / lhbench
Lakehouse storage system benchmark
☆82 · Feb 22, 2023 · Updated 3 years ago