ROCm / gpu-operator
☆44 Updated this week
Alternatives and similar repositories for gpu-operator:
Users that are interested in gpu-operator are comparing it to the libraries listed below
- The NVIDIA Driver Manager is a Kubernetes component that assists in seamless upgrades of the NVIDIA driver on each node of the cluster. ☆35 Updated last month
- ☆38 Updated this week
- ☆35 Updated last week
- IPAM plugin for Kubernetes ☆20 Updated this week
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆92 Updated this week
- This project provides a framework that runs Slurm in Kubernetes. ☆77 Updated last week
- Automatic repair for unhealthy Kubernetes nodes ☆50 Updated last week
- Operator for managing Node Feature Discovery deployment ☆69 Updated last month
- K8s device plugin for GPU sharing ☆100 Updated last year
- Easy Kubevirt images generator - Public images repository 💿 ☆73 Updated 3 months ago
- KJob: Tool for CLI-loving ML researchers ☆27 Updated last week
- ☆143 Updated last week
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆65 Updated last week
- GenAI inference performance benchmarking tool ☆39 Updated 3 weeks ago
- ☆85 Updated 7 months ago
- Kubernetes (k8s) device plugin to enable registration of AMD GPUs to a container cluster ☆320 Updated this week
- Operator that deploys additional KubeVirt resources ☆34 Updated this week
- A Topology-Aware Custom Scheduler For Kubernetes ☆63 Updated last year
- InstaSlice facilitates the use of Dynamic Resource Allocation (DRA) on Kubernetes clusters for GPU sharing ☆27 Updated 4 months ago
- Example DRA driver that developers can fork and modify to get started writing their own. ☆69 Updated last month
- 🧯 Kubernetes coverage for fault awareness and recovery; works for any LLMOps, MLOps, or AI workloads. ☆29 Updated 4 months ago
- Dragonfly Helm Charts ☆36 Updated this week
- Kubernetes Cluster API Provider Virtink ☆25 Updated last year
- Kubernetes Operator, Ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆93 Updated last week
- Enabling Kubernetes to make pod placement decisions with platform intelligence. ☆175 Updated 2 months ago
- Documentation repository for NVIDIA Cloud Native Technologies ☆22 Updated last week
- NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes ☆129 Updated this week
- Holistic job manager on Kubernetes ☆115 Updated last year
- VM-specific tasks for Tekton Pipelines ☆36 Updated last week
- AppWrapper controller for Kueue ☆10 Updated last week