Eventual-Inc / Daft
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
☆2,397 · Updated this week
Alternatives and similar repositories for Daft:
Users who are interested in Daft are comparing it to the libraries listed below.
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,012 · Updated 2 months ago
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆3,987 · Updated last week
- Making data lake work for time series ☆1,141 · Updated 3 months ago
- Python Stream Processing ☆1,579 · Updated this week
- Fastest library to load data from DB to DataFrames in Rust and Python ☆2,026 · Updated 2 weeks ago
- A native Rust library for Delta Lake, with bindings into Python ☆2,360 · Updated this week
- Efficient data transformation and modeling framework that is backwards compatible with dbt. ☆1,853 · Updated this week
- GlareDB: An analytics DBMS for distributed data ☆721 · Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,562 · Updated this week
- Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metada… ☆1,885 · Updated this week
- data load tool (dlt) is an open source Python library that makes data loading easy 🛠️ ☆2,748 · Updated this week
- Malloy is an experimental language for describing data relationships and transformations. ☆2,005 · Updated this week
- 🦆 A curated list of awesome DuckDB resources ☆1,403 · Updated last week
- Apache DataFusion Python Bindings ☆380 · Updated this week
- An extensible, state-of-the-art columnar file format ☆1,014 · Updated this week
- Chronon is a data platform for serving for AI/ML applications. ☆746 · Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,213 · Updated last week
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆935 · Updated last week
- LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads. ☆556 · Updated this week
- the portable Python dataframe library ☆5,354 · Updated this week
- Distributed stream processing engine in Rust ☆3,818 · Updated last week
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆863 · Updated last year
- Apache DataFusion Comet Spark Accelerator ☆829 · Updated this week
- Apache PyIceberg ☆490 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,049 · Updated this week
- WebAssembly version of DuckDB ☆1,314 · Updated last week
- Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to wr… ☆1,857 · Updated this week
- A cloud native embedded storage engine built on object storage. ☆1,576 · Updated this week
- 🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 ☆644 · Updated this week
- The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️ ☆3,516 · Updated 2 months ago