apache / arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
☆16,407 · Updated last week
Alternatives and similar repositories for arrow
Users who are interested in arrow compare it to the libraries listed below.
- Apache DataFusion SQL Query Engine ☆8,286 · Updated this week
- A composable and fully extensible C++ execution engine library for data management systems. ☆4,027 · Updated this week
- The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL ☆6,220 · Updated this week
- Parallel computing with task scheduling ☆13,723 · Updated last week
- Event streaming platform for agents, apps, and analytics. Continuously ingest, transform, and serve event data in real time, at scale. ☆8,735 · Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆6,014 · Updated this week
- Apache Parquet Java ☆3,013 · Updated this week
- Apache Parquet Format ☆2,200 · Updated last week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆8,448 · Updated last week
- Apache Iceberg ☆8,461 · Updated this week
- One Warehouse for Analytics, Search, AI. Snowflake + Elasticsearch + Vector DB — rebuilt from scratch. Unified architecture on your S3. ☆9,108 · Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,631 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,552 · Updated this week
- DuckDB is an analytical in-process SQL database management system ☆35,630 · Updated this week
- NoSQL data store using the Seastar framework, compatible with Apache Cassandra and Amazon DynamoDB ☆15,274 · Updated this week
- Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM! ☆11,598 · Updated this week
- High-performance runtime for data analytics applications ☆3,004 · Updated 3 years ago
- FoundationDB - the open source, distributed, transactional key-value store ☆16,094 · Updated this week
- Apache Avro is a data serialization system. ☆3,212 · Updated this week
- Distributed transactional key-value database, originally created to complement TiDB ☆16,475 · Updated this week
- A library that provides an embeddable, persistent key-value store for fast storage. ☆31,442 · Updated this week
- Alluxio, data orchestration for analytics and machine learning in the cloud ☆7,143 · Updated 8 months ago
- the portable Python dataframe library ☆6,345 · Updated last week
- HeavyDB (formerly MapD/OmniSciDB) ☆3,055 · Updated 2 weeks ago
- BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF. ☆1,999 · Updated 3 years ago
- Extremely fast Query Engine for DataFrames, written in Rust ☆37,057 · Updated this week
- A modular implementation of timely dataflow in Rust ☆3,567 · Updated last week
- ClickHouse® is a real-time analytics database management system ☆45,331 · Updated this week
- Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) ☆12,455 · Updated this week
- Cap'n Proto serialization/RPC system - core tools and C++ library ☆12,793 · Updated last week