Eventual-Inc/Daft

High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale

☆5,654

Alternatives and similar repositories for Daft

Users who are interested in Daft often compare it to the libraries listed below.

lance-format / lance
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data ve…
☆6,839 · Updated this week
vortex-data / vortex
An extensible, state-of-the-art framework for columnar compression, and the fastest FOSS columnar file format. Formerly at @spiraldb, now…
☆3,094 · Updated this week
apache / datafusion
Apache DataFusion SQL Query Engine
☆9,010 · Updated this week
lancedb / lancedb
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
☆10,969 · Updated this week
apache / iceberg-rust
Apache Iceberg
☆1,350 · Updated this week
ArroyoSystems / arroyo
Distributed stream processing engine in Rust
☆4,974 · Updated this week
apache / datafusion-comet
Apache DataFusion Comet Spark Accelerator
☆1,231 · Updated this week
risingwavelabs / risingwave
Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.
☆9,189 · Updated this week
apache / datafusion-ballista
Apache DataFusion Ballista Distributed Query Engine
☆2,095 · Updated this week
delta-io / delta-rs
A native Rust library for Delta Lake, with bindings into Python
☆3,267 · Updated this week
pola-rs / polars
Extremely fast Query Engine for DataFrames, written in Rust
☆39,080 · Updated this week
databendlabs / databend
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
☆9,392 · Updated this week
lakehq / sail
Drop-in Apache Spark replacement written in Rust, unifying batch processing, stream processing, and compute-intensive AI workloads.
☆3,207 · Updated this week
duckdb / ducklake
DuckLake is an integrated data lake and catalog format
☆2,882 · Updated this week
ibis-project / ibis
the portable Python dataframe library
☆6,605 · Updated this week
slatedb / slatedb
A cloud native embedded storage engine built on object storage.
☆3,218 · Updated this week
lakekeeper / lakekeeper
Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.
☆1,400 · Updated this week
fugue-project / fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew…
☆2,170 · May 19, 2026 · Updated 2 months ago
duckdb / duckdb
DuckDB is an analytical in-process SQL database management system
☆39,653 · Updated this week
apache / auron
The Auron accelerator for distributed computing framework (e.g., Spark) leverages native vectorized execution to accelerate query process…
☆1,780 · Updated this week
facebookincubator / velox
A composable and fully extensible C++ execution engine library for data management systems.
☆4,178 · Updated this week
apache / opendal
Apache OpenDAL: One Layer, All Storage.
☆5,253 · Updated this week
deepseek-ai / smallpond
A lightweight data processing framework built on DuckDB and 3FS.
☆4,971 · Mar 5, 2025 · Updated last year
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
GlareDB / glaredb
GlareDB: A light and fast SQL database for analytics
☆1,018 · Nov 14, 2025 · Updated 8 months ago
apache / iggy
Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed
☆4,447 · Updated this week
MaterializeInc / materialize
The live data layer for apps and AI agents. Create up-to-the-second views into your business, just using SQL
☆6,336 · Updated this week
apache / iceberg
Apache Iceberg
☆9,075 · Updated this week
SQLMesh / sqlmesh
Scalable and efficient data transformation framework - backwards compatible with dbt.
☆3,214 · Updated this week
paradedb / paradedb
One Postgres for your application data, full-text search, vector retrieval, and aggregations. Home of the pg_search extension.
☆9,075 · Updated this week
dlt-hub / dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
☆5,646 · Updated this week
apache / arrow-rs
Official Rust implementation of Apache Arrow
☆3,534 · Updated this week
bytewax / bytewax
Python Stream Processing
☆2,037 · Jun 20, 2026 · Updated last month
ray-project / deltacat
A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you…
☆282 · Apr 17, 2026 · Updated 3 months ago
dagster-io / dagster
An orchestration platform for the development, production, and observation of data assets.
☆15,883 · Updated this week
ray-project / ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
☆43,324 · Updated this week
tobymao / sqlglot
Python SQL Parser and Transpiler
☆9,454 · Updated this week
unitycatalog / unitycatalog
Open, Multi-modal Catalog for Data & AI
☆3,465 · Updated this week
apache / arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
☆16,947 · Updated this week