apache / datafusion
Apache DataFusion SQL Query Engine
★6,628 · Updated this week
Alternatives and similar repositories for datafusion:
Users who are interested in datafusion compare it to the libraries listed below.
- Official Rust implementation of Apache Arrow · ★2,704 · Updated this week
- Apache DataFusion Ballista Distributed Query Engine · ★1,618 · Updated this week
- Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://data… · ★8,095 · Updated this week
- Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time E… · ★7,254 · Updated this week
- The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data. · ★5,859 · Updated this week
- Distributed stream processing engine in Rust · ★3,908 · Updated this week
- Extensible SQL Lexer and Parser for Rust · ★2,880 · Updated this week
- A native Rust library for Delta Lake, with bindings into Python · ★2,483 · Updated this week
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… · ★4,119 · Updated this week
- Apache OpenDAL: One Layer, All Storage. · ★3,661 · Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. · ★1,238 · Updated this week
- A modular implementation of timely dataflow in Rust · ★3,336 · Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. · ★3,586 · Updated this week
- TensorBase is a new big data warehousing with modern efforts. · ★1,442 · Updated 2 years ago
- A new arguably faster implementation of Apache Spark from scratch in Rust · ★2,235 · Updated 2 years ago
- Apache Iceberg · ★778 · Updated this week
- An implementation of differential dataflow using timely dataflow on Rust. · ★2,618 · Updated last week
- Build Postgres Extensions with Rust! · ★3,785 · Updated 3 weeks ago
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics · ★14,846 · Updated this week
- An open-source, cloud-native, unified time series database for metrics, logs and events, supporting SQL/PromQL/Streaming. Available on Gr… · ★4,587 · Updated this week
- Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core. · ★1,373 · Updated this week
- Transmute-free Rust library to work with the Arrow format · ★1,063 · Updated 10 months ago
- Apache HoraeDB (incubating) is a high-performance, distributed, cloud native time-series database. · ★2,690 · Updated this week
- Apache DataFusion Comet Spark Accelerator · ★866 · Updated this week
- High-performance runtime for data analytics applications · ★2,993 · Updated 2 years ago
- An educational OLAP database system. · ★1,663 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… · ★7,755 · Updated this week
- Skytable is a modern scalable NoSQL database with BlueQL, designed for performance, scalability and flexibility. Skytable gives you space… · ★2,489 · Updated this week