apache / datafusion-comet
Apache DataFusion Comet Spark Accelerator
☆935 · Updated this week
Alternatives and similar repositories for datafusion-comet:
Users who are interested in datafusion-comet are comparing it to the libraries listed below:
- Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core. ☆1,448 · Updated this week
- Apache Iceberg ☆894 · Updated this week
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust. ☆582 · Updated last week
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. ☆1,322 · Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,289 · Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,724 · Updated last week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,185 · Updated this week
- Apache PyIceberg ☆687 · Updated this week
- A native Delta implementation for integration with any query engine ☆223 · Updated this week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,445 · Updated this week
- New file format for storage of large columnar datasets. ☆532 · Updated this week
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆835 · Updated 2 months ago
- The native Rust implementation for Apache Hudi, with Python API bindings. ☆209 · Updated this week
- Apache DataFusion Ray ☆184 · Updated 3 weeks ago
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆397 · Updated this week
- ☆256 · Updated last week
- LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive (AI) workloads. ☆727 · Updated this week
- Apache DataFusion Python Bindings ☆441 · Updated this week
- Apache Celeborn is an elastic and high-performance service for shuffle and spilled data. ☆947 · Updated this week
- A collection of RBIR projects and posts for anyone interested in joining this journey. ☆235 · Updated this week
- Database connectivity API standard and libraries for Apache Arrow ☆432 · Updated this week
- Open Control Plane for Tables in Data Lakehouse ☆341 · Updated 2 weeks ago
- GlareDB: A light and fast SQL database for analytics ☆795 · Updated this week
- Uniffle is a high performance, general purpose Remote Shuffle Service. ☆415 · Updated this week
- Performance Observability for Apache Spark ☆248 · Updated 2 weeks ago
- Rust implementation of Apache Iceberg with integration for DataFusion ☆166 · Updated 2 weeks ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆246 · Updated this week
- ☆193 · Updated this week
- Distributed SQL Query Engine in Python using Ray ☆243 · Updated 6 months ago
- DuckDB extension for Delta Lake ☆176 · Updated 3 weeks ago