pydiverse / pydiverse.pipedag
A data pipeline orchestration library for rapid iterative development with automatic cache invalidation, allowing users to focus on writing their tasks in pandas, polars, sqlalchemy, ibis, and the like.
☆30 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for pydiverse.pipedag
- Python bindings and arrow integration for the rust object_store crate. ☆57 Updated 3 months ago
- Coming soon ☆58 Updated last year
- ☆86 Updated this week
- ☆34 Updated this week
- Assessing whether data from database complies with reference information. ☆42 Updated this week
- Automated, schema-based JSON unpacking to Polars objects ☆14 Updated 8 months ago
- Arrow, pydantic style ☆82 Updated last year
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆26 Updated this week
- Minimal plugin loading package for polars with optional typegen ☆13 Updated 3 weeks ago
- Polars plugin for stable hashing functionality ☆57 Updated 3 weeks ago
- general functions for your data .pipe()-lines. ☆16 Updated last year
- Cluster tools for running Dask on Databricks ☆13 Updated 5 months ago
- Time based splits for cross validation ☆33 Updated last week
- A Python object protocol for projects to interchange data frame-like data without forcing pandas.DataFrame as the intermediary ☆16 Updated 4 years ago
- RFC document, tooling and other content related to the dataframe API standard ☆102 Updated 7 months ago
- Extremely lightweight compatibility layer between pandas and Polars ☆41 Updated 6 months ago
- Declarative layer for your database. ☆38 Updated last year
- A place to provide Coiled feedback ☆14 Updated 4 months ago
- Dask integration for Snowflake ☆30 Updated last week
- Identifiers and Standard Format Parsing for Polars Dataframe ☆15 Updated 4 months ago
- A Zoo for decorators ☆25 Updated 3 weeks ago
- A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more! ☆67 Updated 2 months ago
- Tools for making Prefect work better for typical data science workflows ☆19 Updated 2 years ago
- Distributed Task Queue based on Dask ☆36 Updated last year
- Automated Jupyter notebook testing. 📙 ☆41 Updated 9 months ago
- ☆40 Updated last year
- IbisML is a library for building scalable ML pipelines using Ibis. ☆95 Updated last month
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 Updated 2 years ago
- Fast, resilient and reproducible data analysis with cached SQL queries ☆30 Updated last year