iterative / dvc-data
DVC's data management subsystem
☆18 Updated this week
Alternatives and similar repositories for dvc-data
Users that are interested in dvc-data are comparing it to the libraries listed below
- ☆27 Updated 2 years ago
- 🏷️ Git Tag Ops. Turn your Git repository into Artifact Registry or Model Registry. ☆152 Updated last week
- Decorators that log stats. ☆113 Updated 5 months ago
- Log and track ML metrics, parameters, models with Git and/or DVC ☆179 Updated this week
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- Run pytest against markdown files/docstrings. ☆128 Updated last month
- Extremely lightweight compatibility layer between pandas and Polars ☆41 Updated last year
- Seamlessly integrate numpy arrays into pydantic models. ☆58 Updated 2 years ago
- Benchmarks for DVC ☆21 Updated this week
- The easiest way to integrate Kedro and Great Expectations ☆54 Updated 2 years ago
- Accompanies the uncool MLOps workshop ☆26 Updated 3 years ago
- Documentation for Nebari ☆16 Updated 3 weeks ago
- Runs black on code cells in a Jupyter notebook ☆50 Updated 3 years ago
- Convert pyproject.toml to environment.yaml ☆132 Updated 2 years ago
- A mini dashboard to help find slow tests in pytest. ☆83 Updated last year
- simple, flexible, offline capable, cloud storage with a Python path-like interface ☆174 Updated 4 months ago
- A plugin for Flake8 that checks pandas code ☆170 Updated 2 years ago
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 Updated 3 years ago
- PyScaffold extension for data-science projects ☆159 Updated 2 weeks ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year
- A use case of a reproducible machine learning pipeline using Dask, DVC, and MLflow. ☆23 Updated 6 years ago
- Kedro-Accelerator speeds up pipelines by parallelizing I/O in the background. ☆36 Updated 3 years ago
- Cloud-agnostic Python API ☆60 Updated last year
- A proof-of-concept for a RAG to query the scikit-learn documentation ☆26 Updated last week
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆54 Updated last month
- Test suite for Python array API standard compliance ☆68 Updated 2 weeks ago
- The DBT of ML, as Aligned describes data dependencies in ML systems and reduces technical data debt ☆60 Updated this week
- Dockerized Jupyter kernels ☆59 Updated 4 years ago
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure and more... ☆144 Updated 3 weeks ago
- spock is a framework that helps manage complex parameter configurations during research and development of Python applications ☆138 Updated last year