larribas / dagger
Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows).
☆17 Updated 9 months ago
Alternatives and similar repositories for dagger:
Users who are interested in dagger are comparing it to the libraries listed below.
- Apache Arrow Development Experiments ☆17 Updated 2 months ago
- Rust DataFusion Server ☆15 Updated this week
- Your go-to for easy access to a plethora of compression algorithms, all neatly bundled in one simple installation. ☆101 Updated 3 weeks ago
- ☆21 Updated 10 months ago
- A collection of self-contained fsspec-based filesystems ☆15 Updated last week
- Cache the intermediate results of queries on timeseries data in DataFusion. ☆18 Updated 4 months ago
- The Ultimate BI tool ☆7 Updated 10 months ago
- Python bindings and arrow integration for the rust object_store crate. ☆62 Updated 7 months ago
- A library for deserializing a variety of file formats directly into numpy arrays ☆29 Updated last year
- Apache Arrow Ballista Python bindings ☆37 Updated last year
- Robust data transformation tool using SQL ☆21 Updated 2 years ago
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆32 Updated 2 years ago
- Arrow, pydantic style ☆82 Updated 2 years ago
- Coming soon ☆60 Updated last year
- A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀) ☆47 Updated this week
- The (B)ig (F)unction (T)axonomy is a detailed reference for common compute functions executed by different libraries, databases, and tool… ☆16 Updated 3 months ago
- Stateful Dataflows tutorials and examples. ☆23 Updated this week
- Journeys between the two worlds of Python 🐍 and Rust 🦀 ☆39 Updated this week
- Proof of concept for Python package management without virtualenvs ☆29 Updated 11 months ago
- ☆18 Updated 2 years ago
- A CLI for spinning up and managing Ray clusters for the Daft Query Engine. ☆11 Updated last month
- An experimental (work-in-progress) statically typed implementation of Apache Arrow ☆18 Updated this week
- Framework to build data pipelines declaratively ☆50 Updated last week
- Implementation of Zarr file format in Rust ☆11 Updated 3 weeks ago
- Antithesis SDK for Rust ☆19 Updated 2 months ago
- Distributed Task Queue based on Dask ☆38 Updated last year
- A fast bloom filter implemented in Rust for Python! 10x faster than pybloom! ☆94 Updated last year
- Official Python client SDK for Iggy.rs message streaming. ☆24 Updated 3 weeks ago
- Package Manager is a JupyterLab extension that simplifies managing Python packages directly within your notebooks ☆14 Updated last month
- Python binding for DataFusion ☆59 Updated 2 years ago