larribas / dagger
Define sophisticated data pipelines with Python and run them on different distributed systems (such as Argo Workflows).
☆17 Updated 11 months ago
Alternatives and similar repositories for dagger:
Users who are interested in dagger are comparing it to the libraries listed below.
- Cache the intermediate results of queries on timeseries data in DataFusion. ☆19 Updated 6 months ago
- Back-end implementation of the Open Data Fabric protocol ☆15 Updated last week
- Official Python client SDK for Iggy.rs message streaming. ☆25 Updated 2 months ago
- Apache Arrow Development Experiments ☆20 Updated 3 months ago
- Journeys between the two worlds of Python 🐍 and Rust 🦀 ☆40 Updated this week
- Robust data transformation tool using SQL ☆21 Updated 2 years ago
- Arrow, pydantic style ☆82 Updated 2 years ago
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆33 Updated 2 years ago
- Simple Workflow Framework - Hamilton + APScheduler = FlowerPower ☆17 Updated this week
- Slipstream provides a data-flow model to simplify development of stateful streaming applications. ☆36 Updated 2 weeks ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 Updated last year
- A collection of self-contained fsspec-based filesystems ☆16 Updated this week
- 🦀 Online statistics in Rust ☆64 Updated last month
- Python bindings and arrow integration for the rust object_store crate. ☆64 Updated 9 months ago
- Hyprstream: Real-time Time-series and High-Performance Cache for Apache Arrow and DuckDB ☆28 Updated 3 months ago
- A fast bloom filter implemented by Rust for Python! 10x faster than pybloom! ☆96 Updated last year
- A minimal Python library for Apache Arrow, connecting to the Rust arrow crate ☆143 Updated 3 weeks ago
- Tantivy directory implementation backed by object_store ☆33 Updated last year
- ☆21 Updated last year
- Rust DataFusion Server ☆16 Updated last week
- Real-time data processing/feature engineering in Python. Tailored for modern AI/ML systems. ☆57 Updated this week
- A Kubernetes operator for managing Prefect servers and work pools ☆13 Updated this week
- Framework to build data pipelines declaratively ☆50 Updated this week
- The Postgres adapter for Harlequin, the SQL IDE for your Terminal ☆15 Updated 2 months ago
- Inspect Your Servers with DuckDB ☆30 Updated 2 years ago
- Rapid fuzzy string matching in Rust using various string metrics ☆53 Updated 10 months ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver. ☆24 Updated last year
- A cli for spinning up and managing Ray clusters for the Daft Query Engine. ☆11 Updated 2 months ago
- Flat files, flat land. ☆26 Updated this week
- A robust (🐢) and fast (🐇) MLOps tool for managing data and pipelines in Rust (🦀) ☆51 Updated 2 weeks ago