NVIDIA / gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
☆2,474 Updated this week
Alternatives and similar repositories for gpu-operator
Users who are interested in gpu-operator are comparing it to the repositories listed below.
- NVIDIA device plugin for Kubernetes ☆3,612 Updated this week
- NVIDIA GPU metrics exporter for Prometheus leveraging DCGM ☆1,557 Updated 3 weeks ago
- NVIDIA DRA Driver for GPUs ☆526 Updated last week
- Kubernetes-native Job Queueing ☆2,237 Updated last week
- KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale ☆1,059 Updated this week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,521 Updated 2 years ago
- Kubeflow Deployment Manifests ☆980 Updated last week
- Heterogeneous AI Computing Virtualization Middleware (Project under CNCF) ☆2,847 Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆645 Updated 2 weeks ago
- Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster ☆364 Updated 2 weeks ago
- GPU plugin to the node feature discovery for Kubernetes ☆308 Updated last year
- Node feature discovery for Kubernetes ☆981 Updated last week
- Distributed AI Model Training and Fine-Tuning on Kubernetes ☆1,997 Updated this week
- A toolkit to run Ray applications on Kubernetes ☆2,245 Updated this week
- Tools for monitoring NVIDIA GPUs on Linux ☆1,064 Updated 4 years ago
- CLI and validation tools for Kubelet Container Runtime Interface (CRI). ☆1,930 Updated last week
- Repository for out-of-tree scheduler plugins based on scheduler framework. ☆1,261 Updated last month
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆504 Updated 2 weeks ago
- Gateway API Inference Extension ☆559 Updated this week
- Simple Kubernetes Operator for MinIO clusters ☆1,410 Updated 3 weeks ago
- GPU Sharing Device Plugin for Kubernetes Cluster ☆491 Updated 2 years ago
- NVIDIA device plugin for Kubernetes ☆49 Updated last year
- Dynamically provisioning persistent local storage with Kubernetes ☆2,736 Updated this week
- ☆890 Updated last year
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆638 Updated last month
- Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs. ☆2,548 Updated this week
- Nvidia GPU exporter for prometheus using nvidia-smi binary ☆1,368 Updated this week
- This driver allows Kubernetes to access NFS server on Linux node. ☆1,197 Updated 2 weeks ago
- MIG Partition Editor for NVIDIA GPUs ☆235 Updated 2 weeks ago
- A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC ☆1,091 Updated 2 years ago