apache/arrow

apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics

☆16,971

Alternatives and similar repositories for arrow

Users who are interested in arrow are comparing it to the libraries listed below.

apache / datafusion
Apache DataFusion SQL Query Engine
☆9,076 · Updated this week
duckdb / duckdb
DuckDB is an analytical in-process SQL database management system
☆39,913 · Updated this week
apache / arrow-rs
Official Rust implementation of Apache Arrow
☆3,558 · Updated this week
facebookincubator / velox
A composable and fully extensible C++ execution engine library for data management systems.
☆4,183 · Updated this week
pola-rs / polars
Extremely fast Query Engine for DataFrames, written in Rust
☆39,157 · Updated this week
ClickHouse / ClickHouse
ClickHouse® is a real-time analytics database management system
☆49,004 · Updated this week
apache / iceberg
Apache Iceberg
☆9,103 · Updated this week
dask / dask
Parallel computing with task scheduling
☆13,878 · Updated this week
facebook / rocksdb
A library that provides an embeddable, persistent key-value store for fast storage.
☆31,930 · Updated this week
ray-project / ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
☆43,414 · Updated this week
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,926 · Updated this week
databendlabs / databend
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
☆9,405 · Updated this week
trinodb / trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
☆13,086 · Updated this week
substrait-io / substrait
A cross platform way to express data transformation, relational algebra, standardized record expression and plans.
☆1,541 · Updated this week
lance-format / lance
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve…
☆6,897 · Updated this week
apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆46,354 · Updated this week
prestodb / presto
The official home of the Presto distributed SQL query engine for big data
☆16,715 · Updated this week
MaterializeInc / materialize
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL
☆6,344 · Updated this week
apache / parquet-java
Apache Parquet Java
☆3,071 · Updated this week
tikv / tikv
Distributed transactional key-value database, originally created to complement TiDB
☆16,784 · Updated this week
apache / spark
Apache Spark - A unified analytics engine for large-scale data processing
☆43,765 · Updated this week
apache / parquet-format
Apache Parquet Format
☆2,511 · Updated this week
apache / datafusion-ballista
Apache DataFusion Ballista Distributed Query Engine
☆2,097 · Updated this week
ibis-project / ibis
the portable Python dataframe library
☆6,618 · Updated this week
rapidsai / cudf
cuDF - GPU DataFrame Library
☆9,722 · Updated this week
risingwavelabs / risingwave
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
☆9,200 · Updated this week
apache / superset
Apache Superset is a Data Visualization and Data Exploration Platform
☆74,104 · Updated this week
scylladb / scylladb
NoSQL data store using the Seastar framework, compatible with Apache Cassandra and Amazon DynamoDB
☆15,683 · Updated this week
apache / druid
Apache Druid: a high performance real-time analytics database.
☆14,036 · Updated this week
google / flatbuffers
FlatBuffers: Memory Efficient Serialization Library
☆26,284 · Jun 22, 2026 · Updated last month
apache / beam
Apache Beam is a unified programming model for Batch and Streaming data processing.
☆8,638 · Updated this week
apple / foundationdb
FoundationDB - the open source, distributed, transactional key-value store
☆16,570 · Updated this week
apache / calcite
Apache Calcite
☆5,163 · Updated this week
apache / flink
Apache Flink
☆26,234 · Updated this week
weld-project / weld
High-performance runtime for data analytics applications
☆3,007 · Apr 13, 2026 · Updated 3 months ago
apache / pinot
Apache Pinot - A realtime distributed OLAP datastore
☆6,118 · Updated this week
timescale / timescaledb
A time-series database for high-performance real-time analytics packaged as a Postgres extension
☆23,215 · Updated this week
cockroachdb / cockroach
CockroachDB - the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placemen…
☆32,343 · Updated this week
apache / doris
Apache Doris is a real-time analytics and hybrid search database for AI agents.
☆15,703 · Updated this week