apache / arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
☆16,054 Updated this week
Alternatives and similar repositories for arrow
Users who are interested in arrow are comparing it to the libraries listed below.
- Apache DataFusion SQL Query Engine ☆7,866 Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,328 Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. ☆3,913 Updated this week
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆5,462 Updated last week
- DuckDB is an analytical in-process SQL database management system ☆33,352 Updated this week
- Parallel computing with task scheduling ☆13,518 Updated last week
- Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) ☆12,002 Updated this week
- Extremely fast Query Engine for DataFrames, written in Rust ☆35,673 Updated this week
- Apache Parquet Java ☆2,957 Updated last week
- Apache Iceberg ☆8,066 Updated this week
- the portable Python dataframe library ☆6,149 Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆5,921 Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,519 Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆42,764 Updated this week
- AI-Native Data Warehouse. Open-source Snowflake alternative. Proven at petabyte scale with enterprise performance. B… ☆8,917 Updated this week
- High-performance runtime for data analytics applications ☆3,000 Updated 3 years ago
- Apache Parquet Format ☆2,063 Updated last week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆8,330 Updated this week
- NoSQL data store using the Seastar framework, compatible with Apache Cassandra and Amazon DynamoDB ☆14,929 Updated this week
- Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data. ☆6,135 Updated this week
- Real-time event streaming platform. Streaming CDC, stream processing, low-latency serving, and Iceberg management. ☆8,424 Updated this week
- A time-series database for high-performance real-time analytics packaged as a Postgres extension ☆20,397 Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆11,698 Updated this week
- The Universal Storage Engine ☆1,987 Updated this week
- A library that provides an embeddable, persistent key-value store for fast storage. ☆30,780 Updated this week
- cuDF - GPU DataFrame Library ☆9,239 Updated this week
- Official Rust implementation of Apache Arrow ☆3,180 Updated this week
- Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM! ☆11,114 Updated this week
- A modular implementation of timely dataflow in Rust ☆3,520 Updated 2 weeks ago
- FoundationDB - the open source, distributed, transactional key-value store ☆15,756 Updated this week