Eventual-Inc / Daft
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
☆2,306 Updated this week
Related projects
Alternatives and complementary repositories for Daft
- Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, v… ☆3,929 Updated this week
- Making data lake work for time series ☆1,136 Updated 2 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,003 Updated last month
- Fastest library to load data from DB to DataFrames in Rust and Python ☆1,995 Updated this week
- Python Stream Processing ☆1,543 Updated this week
- Efficient data transformation and modeling framework that is backwards compatible with dbt. ☆1,785 Updated this week
- A native Rust library for Delta Lake, with bindings into Python ☆2,308 Updated this week
- An extensible, state-of-the-art columnar file format ☆967 Updated this week
- Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metada… ☆1,848 Updated this week
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,199 Updated this week
- Apache DataFusion Ballista Distributed Query Engine ☆1,531 Updated this week
- Chronon is a data platform for serving for AI/ML applications. ☆739 Updated this week
- 🦆 A curated list of awesome DuckDB resources ☆1,353 Updated this week
- data load tool (dlt) is an open source Python library that makes data loading easy 🛠️ ☆2,587 Updated this week
- Malloy is an experimental language for describing data relationships and transformations. ☆1,990 Updated this week
- Apache DataFusion Comet Spark Accelerator ☆816 Updated this week
- dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org) ☆916 Updated this week
- MetricFlow allows you to define, build, and maintain metrics in code. ☆1,143 Updated this week
- GlareDB: An analytics DBMS for distributed data ☆678 Updated last week
- A light-weight, flexible, and expressive statistical data testing library ☆3,364 Updated last week
- An open-source ML pipeline development platform ☆973 Updated 3 weeks ago
- A curated list of Polars talks, tools, examples & articles. Contributions welcome ! ☆750 Updated this week
- WebAssembly version of DuckDB ☆1,271 Updated this week
- the portable Python dataframe library ☆5,267 Updated this week
- Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code. ☆1,707 Updated this week
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,815 Updated this week
- New file format for storage of large columnar datasets. ☆449 Updated this week
- Apache DataFusion Python Bindings ☆373 Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,031 Updated this week
- Embeddable property graph database management system built for query speed and scalability. Implements Cypher. ☆1,388 Updated this week