apache / datafusion
Apache DataFusion SQL Query Engine
☆7,709 Updated last week
Alternatives and similar repositories for datafusion
Users who are interested in datafusion are comparing it to the libraries listed below.
- Official Rust implementation of Apache Arrow ☆3,121 Updated last week
- Apache DataFusion Ballista Distributed Query Engine ☆1,841 Updated last week
- Real-time event streaming platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management. ☆8,340 Updated this week
- AI-Native Data Warehouse. Open-source Snowflake alternative. Proven at petabyte scale with enterprise performance. B… ☆8,822 Updated this week
- Extensible SQL Lexer and Parser for Rust ☆3,184 Updated this week
- Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data. ☆6,109 Updated this week
- Distributed stream processing engine in Rust ☆4,518 Updated last week
- A native Rust library for Delta Lake, with bindings into Python ☆2,946 Updated this week
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆5,363 Updated this week
- Apache OpenDAL: One Layer, All Storage. ☆4,423 Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. ☆3,889 Updated this week
- Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming. Available on Gre… ☆5,527 Updated this week
- A modular implementation of timely dataflow in Rust ☆3,504 Updated 3 weeks ago
- Apache Iceberg ☆1,081 Updated this week
- A new arguably faster implementation of Apache Spark from scratch in Rust ☆2,239 Updated 3 years ago
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,381 Updated this week
- Distributed query engine providing simple and reliable data processing for any modality and scale ☆4,384 Updated this week
- The Auron accelerator for distributed computing framework (e.g., Spark) leverages native vectorized execution to accelerate query process… ☆1,587 Updated this week
- Apache HoraeDB (incubating) is a high-performance, distributed, cloud native time-series database. ☆2,782 Updated 2 weeks ago
- Build Postgres Extensions with Rust! ☆4,150 Updated 3 weeks ago
- TensorBase is a new big data warehousing with modern efforts. ☆1,453 Updated 3 years ago
- Transmute-free Rust library to work with the Arrow format ☆1,063 Updated last year
- Apache DataFusion Comet Spark Accelerator ☆1,037 Updated this week
- An implementation of differential dataflow using timely dataflow on Rust. ☆2,805 Updated this week
- An extensible, state of the art columnar file format. Formerly at @spiraldb, now an Incubation Stage project at LFAI&Data, part of the Li… ☆1,736 Updated this week
- DuckLake is an integrated data lake and catalog format ☆2,037 Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆15,964 Updated this week
- A cloud native embedded storage engine built on object storage. ☆2,282 Updated last week
- Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed ☆2,893 Updated last week
- GlueSQL is quite sticky. It attaches to anywhere. ☆2,936 Updated this week