apache / datafusion-comet
Apache DataFusion Comet Spark Accelerator
☆922 Updated this week
Alternatives and similar repositories for datafusion-comet:
Users who are interested in datafusion-comet are comparing it to the libraries listed below.
- Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core. ☆1,431 Updated this week
- Apache Iceberg ☆871 Updated this week
- Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust. ☆526 Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,699 Updated this week
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. ☆1,307 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,277 Updated last week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,166 Updated this week
- The native Rust implementation for Apache Hudi, with Python API bindings. ☆206 Updated this week
- Apache PyIceberg ☆663 Updated this week
- A native Delta implementation for integration with any query engine ☆207 Updated this week
- Apache Polaris, the interoperable, open source catalog for Apache Iceberg ☆1,398 Updated this week
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆393 Updated 3 weeks ago
- New file format for storage of large columnar datasets. ☆497 Updated 2 weeks ago
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆829 Updated last month
- LakeSail's computation framework with a mission to unify batch processing, stream processing, and compute-intensive (AI) workloads. ☆698 Updated last week
- Apache DataFusion Ray ☆180 Updated 3 weeks ago
- Apache Celeborn is an elastic and high-performance service for shuffle and spilled data. ☆938 Updated this week
- Open Control Plane for Tables in Data Lakehouse ☆336 Updated this week
- A collection of RBIR projects and posts for anyone interested in joining this journey. ☆230 Updated this week
- ☆244 Updated this week
- Apache DataFusion Python Bindings ☆429 Updated last week
- Database connectivity API standard and libraries for Apache Arrow ☆424 Updated this week
- Rust implementation of Apache Iceberg with integration for Datafusion ☆157 Updated this week
- Uniffle is a high performance, general purpose Remote Shuffle Service. ☆412 Updated last week
- This is the companion repository for the book How Query Engines Work. ☆384 Updated last year
- Distributed SQL Query Engine in Python using Ray ☆243 Updated 6 months ago
- Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats. ☆942 Updated this week
- Performance Observability for Apache Spark ☆239 Updated last week
- ☆189 Updated 2 weeks ago
- Remote shuffle service for Apache Spark to store shuffle data on remote servers. ☆326 Updated last year