apache / arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
☆14,846 Updated this week
Alternatives and similar repositories for arrow:
Users who are interested in arrow compare it to the libraries listed below
- Apache DataFusion SQL Query Engine ☆6,628 Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. ☆3,586 Updated this week
- High-performance runtime for data analytics applications ☆2,993 Updated 2 years ago
- Apache Parquet Java ☆2,686 Updated this week
- Parallel computing with task scheduling ☆12,851 Updated this week
- DuckDB is an analytical in-process SQL database management system ☆25,834 Updated this week
- The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data. ☆5,859 Updated this week
- Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://data… ☆8,095 Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆7,755 Updated this week
- Distributed transactional key-value database, originally created to complement TiDB ☆15,462 Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,155 Updated this week
- Apache Parquet Format ☆1,851 Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆38,298 Updated this week
- high-performance graph database for real-time use cases ☆20,582 Updated this week
- A new arguably faster implementation of Apache Spark from scratch in Rust ☆2,235 Updated 2 years ago
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ☆1,941 Updated 2 years ago
- Apache Iceberg ☆6,767 Updated this week
- The Universal Storage Engine ☆1,887 Updated this week
- the portable Python dataframe library ☆5,451 Updated this week
- Dataframes powered by a multithreaded, vectorized query engine, written in Rust ☆31,399 Updated this week
- A modular implementation of timely dataflow in Rust ☆3,336 Updated this week
- Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time E… ☆7,254 Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆34,903 Updated this week
- Apache Druid: a high performance real-time analytics database. ☆13,593 Updated this week
- ClickHouse® is a real-time analytics database management system ☆38,472 Updated this week
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆4,119 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,238 Updated this week
- Graphs for Everyone ☆13,673 Updated last week
- cuDF - GPU DataFrame Library ☆8,597 Updated this week