kubeflow / trainer
Distributed ML Training and Fine-Tuning on Kubernetes
☆1,792 Updated this week
Alternatives and similar repositories for trainer
Users who are interested in trainer are comparing it to the libraries listed below.
- A CLI for Kubeflow. ☆777 Updated this week
- Automated Machine Learning on Kubernetes ☆1,581 Updated 2 weeks ago
- Kubeflow Deployment Manifests ☆913 Updated this week
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC ☆1,091 Updated 2 years ago
- NVIDIA device plugin for Kubernetes ☆3,238 Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,472 Updated last year
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆480 Updated 2 weeks ago
- A Cloud Native Batch System (Project under CNCF) ☆4,696 Updated this week
- A toolkit to run Ray applications on Kubernetes ☆1,786 Updated this week
- Standardized Serverless ML Inference Platform on Kubernetes ☆4,202 Updated this week
- Run your deep learning workloads on Kubernetes more easily and efficiently. ☆522 Updated last year
- GPU Sharing Device Plugin for Kubernetes Cluster ☆481 Updated 2 years ago
- Information about the Kubeflow community including proposals and governance information. ☆173 Updated last week
- PyTorch on Kubernetes ☆309 Updated 3 years ago
- Machine Learning Pipelines for Kubeflow ☆3,842 Updated this week
- NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes ☆2,139 Updated this week
- ☆875 Updated last year
- Kubernetes-native Job Queueing ☆1,798 Updated this week
- Repository for out-of-tree scheduler plugins based on scheduler framework. ☆1,203 Updated last week
- Kubeflow’s superfood for Data Scientists ☆633 Updated 2 years ago
- A repository to host extended examples and tutorials ☆1,438 Updated last month
- Docker for Your ML/DL Models Based on OCI Artifacts ☆467 Updated last year
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆457 Updated this week
- Machine Learning Toolkit for Kubernetes ☆14,998 Updated last month
- Apache YuniKorn Core ☆931 Updated 3 weeks ago
- Controller for ModelMesh ☆230 Updated 3 weeks ago
- A QoS-based scheduling system brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, … ☆1,518 Updated this week
- Kubeflow Website ☆169 Updated this week
- Python SDK for building, training, and deploying ML models ☆337 Updated 3 years ago
- Repo for the controller-runtime subproject of kubebuilder (sig-apimachinery) ☆2,721 Updated this week