apache / arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
☆14,615 · Updated this week
Related projects
Alternatives and complementary repositories for arrow
- Apache DataFusion SQL Query Engine ☆6,312 · Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. ☆3,520 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆7,608 · Updated this week
- High-performance runtime for data analytics applications ☆2,996 · Updated 2 years ago
- A library that provides an embeddable, persistent key-value store for fast storage. ☆28,670 · Updated this week
- Apache Parquet Format ☆1,810 · Updated last week
- Apache Parquet Java ☆2,642 · Updated this week
- DuckDB is an analytical in-process SQL database management system ☆24,380 · Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,059 · Updated this week
- HeavyDB (formerly OmniSciDB) ☆2,956 · Updated 2 months ago
- Parallel computing with task scheduling ☆12,604 · Updated this week
- the portable Python dataframe library ☆5,318 · Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆37,168 · Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆5,517 · Updated this week
- Apache Druid: a high performance real-time analytics database. ☆13,514 · Updated this week
- Apache Iceberg ☆6,473 · Updated this week
- The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data. ☆5,809 · Updated this week
- Dataframes powered by a multithreaded, vectorized query engine, written in Rust ☆30,435 · Updated this week
- The Universal Storage Engine ☆1,868 · Updated this week
- Data-Centric Pipelines and Data Versioning ☆6,181 · Updated this week
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ☆1,934 · Updated 2 years ago
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆3,964 · Updated this week
- Distributed transactional key-value database, originally created to complement TiDB ☆15,291 · Updated this week
- An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension. ☆17,907 · Updated this week
- Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://data… ☆7,867 · Updated this week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆7,882 · Updated this week
- ZetaSQL - Analyzer Framework for SQL ☆2,325 · Updated last week
- NoSQL data store using the seastar framework, compatible with Apache Cassandra ☆13,598 · Updated this week