linkedin / venice
Venice, Derived Data Platform for Planet-Scale Workloads.
☆526 Updated this week
Alternatives and similar repositories for venice:
Users who are interested in venice are comparing it to the repositories listed below.
- This is the companion repository for the book How Query Engines Work. ☆384 Updated last year
- Apache DataFusion Comet Spark Accelerator ☆922 Updated this week
- New file format for storage of large columnar datasets. ☆497 Updated last week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,164 Updated this week
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆829 Updated last month
- Apache Iceberg ☆856 Updated this week
- This is RonDB, a distribution of NDB Cluster developed and used by Hopsworks AB. It also contains development branches of RonDB. ☆614 Updated this week
- An extensible distributed system for reliable nearline data streaming at scale ☆934 Updated 10 months ago
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,277 Updated this week
- ClickBench: a Benchmark For Analytical Databases ☆768 Updated this week
- Open Control Plane for Tables in Data Lakehouse ☆333 Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,695 Updated last week
- GlareDB: An analytics DBMS for distributed data ☆777 Updated this week
- Remote shuffle service for Apache Spark to store shuffle data on remote servers. ☆326 Updated last year
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. ☆1,298 Updated this week
- Oxia - Metadata store and coordination system ☆242 Updated this week
- ☆610 Updated 2 years ago
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆424 Updated 3 years ago
- The gateway component to make Spark on K8s much easier for Spark users. ☆187 Updated last month
- A collection of RBIR projects and posts for anyone interested in joining this journey. ☆230 Updated this week
- A library that provides an embeddable, persistent key-value store for fast storage optimized for AWS ☆787 Updated 2 months ago
- Distributed SQL Query Engine in Python using Ray ☆243 Updated 5 months ago
- 🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 ☆700 Updated this week
- CMU-DB's Cascades optimizer framework ☆396 Updated 2 months ago
- Waltz is a quorum-based distributed write-ahead log for replicating transactions ☆415 Updated 2 years ago
- A RocksDB compliant high performance scalable embedded key-value store ☆965 Updated 9 months ago
- A load balancer / proxy / gateway for prestodb ☆357 Updated 8 months ago
- A Relational Database Backed by Apache Kafka ☆389 Updated last week
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆299 Updated last year
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,398 Updated this week