qubole/rubix

Cache File System optimized for columnar formats and object stores

☆188

Alternatives and similar repositories for rubix

Users who are interested in rubix are comparing it to the libraries listed below.

qubole / quark
Quark is a data virtualization engine over analytic databases.
☆101 · Jul 13, 2017 · Updated 9 years ago
lyft / presto-gateway
A load balancer / proxy / gateway for prestodb
☆359 · Jul 25, 2024 · Updated last year
qubole / presto-udfs
Plugin for Presto to allow addition of user functions easily
☆119 · Mar 31, 2021 · Updated 5 years ago
qubole / sparklens
Qubole Sparklens tool for performance tuning Apache Spark
☆592 · Jun 26, 2024 · Updated 2 years ago
lightcopy / parquet-index
Spark SQL index for Parquet tables
☆134 · May 6, 2021 · Updated 5 years ago
linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
bullet-db / bullet-core
Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…
☆42 · Dec 14, 2022 · Updated 3 years ago
vlsi / calcite-test-dataset
Data sets and Vagrant script to provision a virtual machine for Apache Calcite development
☆30 · Mar 24, 2023 · Updated 3 years ago
oap-project / remote-shuffle
Spark* shuffle plugin for support shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis…
☆21 · Mar 15, 2024 · Updated 2 years ago
GiraffaFS / giraffa
Giraffa FileSystem (Slack: giraffa-fs.slack.com)
☆18 · Mar 8, 2017 · Updated 9 years ago
qubole / spark-on-lambda
Apache Spark on AWS Lambda
☆158 · Dec 5, 2022 · Updated 3 years ago
MemVerge / splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
☆131 · Dec 19, 2024 · Updated last year
qubole / spark-acid
ACID Data Source for Apache Spark based on Hive ACID
☆97 · Jul 7, 2021 · Updated 5 years ago
criteo / garmadon
Java event logs collector for hadoop and frameworks
☆42 · Mar 25, 2025 · Updated last year
hdinsight / tpcds-hdinsight
TPCDS benchmark for various engines
☆18 · Feb 11, 2022 · Updated 4 years ago
prestodb / tempto
A testing framework for Presto
☆63 · Jun 8, 2026 · Updated last month
ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆93 · Mar 5, 2024 · Updated 2 years ago
amplab / velox-modelserver
☆110 · Apr 17, 2017 · Updated 9 years ago
prestodb / benchto
Framework for running macro benchmarks in a clustered environment
☆25 · Aug 29, 2022 · Updated 3 years ago
microsoft / hyperspace
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
☆430 · Jan 14, 2022 · Updated 4 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
ottogroup / schedoscope
Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what…
☆98 · Nov 14, 2019 · Updated 6 years ago
uber / RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
☆335 · Sep 29, 2023 · Updated 2 years ago
apache / orc-format
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
☆16 · May 15, 2026 · Updated 2 months ago
airlift / discovery
Discovery Server
☆56 · May 7, 2024 · Updated 2 years ago
uber / uberscriptquery
UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy
☆65 · Dec 17, 2023 · Updated 2 years ago
SaurabhChawla100 / spark-radiant
Spark-Radiant is Apache Spark Performance and Cost Optimizer
☆25 · Dec 31, 2024 · Updated last year
kj-ki / tpc-h-impala
TPC-H Benchmark on Cloudera Impala
☆19 · Apr 25, 2013 · Updated 13 years ago
airlift / slice
Java library for efficiently working with flat heap memory
☆518 · Updated this week
prestodb / presto-admin
A tool to install, configure and manage Presto installations
☆172 · Dec 27, 2022 · Updated 3 years ago
yahoo / maha
A framework for rapid reporting API development; with out of the box support for high cardinality dimension lookups with druid.
☆133 · Jan 17, 2025 · Updated last year
falarica / steerd-presto-operator
Kubernetes (K8s) Operator for PrestoDB
☆46 · Sep 29, 2021 · Updated 4 years ago
splicemachine / spliceengine
The SpliceSQL Engine
☆172 · Jun 15, 2023 · Updated 3 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
CODAIT / stocator
Stocator is high performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
☆115 · May 17, 2024 · Updated 2 years ago
xerial / presto-metrics
Presto metric collection library for Ruby
☆26 · Jul 1, 2026 · Updated 2 weeks ago
qubole / streamx
kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
☆96 · Apr 4, 2019 · Updated 7 years ago