apache / datafusion-comet
Apache DataFusion Comet Spark Accelerator
☆745 Updated last week
Related projects:
- Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core. ☆1,126 Updated this week
- Apache Iceberg ☆598 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,167 Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,468 Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆984 Updated this week
- Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines. ☆1,143 Updated this week
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆354 Updated last week
- Apache PyIceberg ☆385 Updated this week
- Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages. ☆780 Updated 2 weeks ago
- New file format for storage of large columnar datasets. ☆421 Updated last week
- A Rust implementation of the Iceberg REST Catalog specification. ☆145 Updated this week
- The interoperable, open source catalog for Apache Iceberg ☆1,012 Updated this week
- A native Rust library for Apache Hudi, with bindings into Python ☆137 Updated this week
- A native Delta implementation for integration with any query engine ☆114 Updated last week
- Database connectivity API standard and libraries for Apache Arrow ☆360 Updated this week
- Apache DataFusion Python Bindings ☆346 Updated last week
- ☆197 Updated last month
- Distributed SQL Query Engine in Python using Ray ☆230 Updated 9 months ago
- A load balancer / proxy / gateway for prestodb ☆356 Updated last month
- Remote shuffle service for Apache Spark to store shuffle data on remote servers. ☆321 Updated 11 months ago
- Open Control Plane for Tables in Data Lakehouse ☆289 Updated this week
- This is the companion repository for the book How Query Engines Work. ☆356 Updated last year
- The Feldera Incremental Computation Engine ☆366 Updated this week
- Performance Observability for Apache Spark ☆163 Updated last week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆297 Updated last year
- A native Rust library for Delta Lake, with bindings into Python ☆2,155 Updated this week
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆692 Updated last month
- ☆375 Updated this week
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆341 Updated 3 months ago
- ☆232 Updated this week