capitalone / rubicon-ml
Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!
☆127 Updated last week
Related projects:
- IbisML is a library for building scalable ML pipelines using Ibis. ☆81 Updated this week
- A toolbox 🧰 for Jupyter notebooks: testing, experiment tracking, debugging, profiling, and more! ☆60 Updated 6 months ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆49 Updated 2 weeks ago
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure, and more... ☆130 Updated this week
- Dask integration for Snowflake ☆29 Updated 2 months ago
- A Kedro plugin that provides drop-in replacements for the pandas datasets (e.g., Modin and cuDF) ☆12 Updated 3 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆78 Updated 4 months ago
- ☆115 Updated this week
- ☆34 Updated this week
- DataFrame support for scikit-learn. ☆63 Updated 10 months ago
- An abstraction layer for parameter tuning ☆36 Updated 2 weeks ago
- Plugins, extensions, case studies, articles, and video tutorials for Kedro ☆59 Updated 2 months ago
- ☆107 Updated this week
- Kedro-Accelerator speeds up pipelines by parallelizing I/O in the background. ☆35 Updated 2 years ago
- Supporting materials/code examples for my course in data engineering for machine learning. ☆37 Updated last year
- The easiest way to integrate Kedro and Great Expectations ☆52 Updated last year
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆72 Updated 7 months ago
- RFC document, tooling and other content related to the dataframe API standard ☆99 Updated 5 months ago
- Implementation of Cyclic Boosting machine learning algorithms ☆87 Updated 2 weeks ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆111 Updated 5 months ago
- fsspec-compatible Azure Data Lake and Azure Blob Storage access ☆175 Updated last month
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆28 Updated last year
- Bulwark is a package for convenient property-based testing of pandas dataframes. ☆223 Updated 4 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆105 Updated last year
- Primrose modeling framework for simple production models ☆34 Updated 6 months ago
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information. ☆145 Updated last week
- A Kedro plugin that streamlines the integration between Kedro projects and third-party applications, making it easier for you to develop… ☆34 Updated 2 months ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆82 Updated 8 months ago
- Automated Jupyter notebook testing. ☆41 Updated 7 months ago
- Track & manage metadata, visualize & compare Kedro pipelines in a nice UI. ☆18 Updated last month