pydiverse / pydiverse.pipedag
A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
☆29 Updated last week
Related projects
Alternatives and complementary repositories for pydiverse.pipedag
- ☆86 Updated this week
- Cluster tools for running Dask on Databricks ☆13 Updated 5 months ago
- Python bindings and arrow integration for the rust object_store crate. ☆56 Updated 3 months ago
- Coming soon ☆58 Updated last year
- Tools for making Prefect work better for typical data science workflows ☆19 Updated 2 years ago
- Extremely lightweight compatibility layer between pandas and Polars ☆40 Updated 6 months ago
- Arrow, pydantic style ☆82 Updated last year
- LETSQL is a deferred compute system focused on Preprocessing for AI pipelines. Optimize performance with cross-engine caching and static … ☆67 Updated this week
- Assessing whether data from database complies with reference information. ☆42 Updated this week
- A repository of runnable examples using ibis ☆40 Updated 4 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆93 Updated last month
- Dask integration for Snowflake ☆30 Updated 4 months ago
- Polars plugin for stable hashing functionality ☆57 Updated last week
- A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more! ☆67 Updated last month
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 Updated 2 years ago
- Simplifying conditional Polars Expressions with Python 🐍 🐻❄️ ☆100 Updated last week
- ☆32 Updated this week
- Fast, resilient and reproducible data analysis with cached SQL queries ☆30 Updated last year
- Minimal plugin loading package for polars with optional typegen ☆13 Updated last week
- general functions for your data .pipe()-lines. ☆16 Updated last year
- Plugins, extensions, case studies, articles, and video tutorials for Kedro ☆63 Updated last month
- SQLAlchemy dialect for Turbodbc ☆23 Updated 5 months ago
- Unified Distributed Execution ☆49 Updated 3 weeks ago
- An abstraction layer for parameter tuning ☆36 Updated 2 months ago
- Streaming and approximate algorithms. WIP, use at own risk. ☆24 Updated 2 months ago
- Time based splits for cross validation ☆31 Updated this week
- Declarative layer for your database. ☆39 Updated last year
- A simple key/value store built on Prefect ☆22 Updated last year
- Distributed Task Queue based on Dask ☆36 Updated last year