NVIDIA / gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
☆2,395 Updated this week
Alternatives and similar repositories for gpu-operator
Users who are interested in gpu-operator compare it to the libraries listed below.
- NVIDIA device plugin for Kubernetes ☆3,533 Updated this week
- NVIDIA GPU metrics exporter for Prometheus leveraging DCGM ☆1,472 Updated last week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,514 Updated last year
- NVIDIA DRA Driver for GPUs ☆477 Updated last week
- Heterogeneous AI Computing Virtualization Middleware (Project under CNCF) ☆2,615 Updated this week
- KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale ☆899 Updated last week
- Kubernetes-native Job Queueing ☆2,057 Updated last week
- A toolkit to run Ray applications on Kubernetes ☆2,129 Updated this week
- Kubeflow Deployment Manifests ☆962 Updated this week
- Tools for monitoring NVIDIA GPUs on Linux ☆1,059 Updated 4 years ago
- Distributed AI Model Training and Fine-Tuning on Kubernetes ☆1,957 Updated last week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆611 Updated this week
- GPU plugin to the node feature discovery for Kubernetes ☆306 Updated last year
- Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster ☆354 Updated 2 weeks ago
- Node feature discovery for Kubernetes ☆959 Updated this week
- ☆887 Updated last year
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆497 Updated this week
- This driver allows Kubernetes to access an NFS server on a Linux node. ☆1,165 Updated this week
- Repository for out-of-tree scheduler plugins based on scheduler framework. ☆1,242 Updated 2 weeks ago
- Tools for building GPU clusters ☆1,403 Updated 4 months ago
- Dynamically provisioning persistent local storage with Kubernetes ☆2,666 Updated 2 weeks ago
- Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes ☆4,746 Updated this week
- GPU Sharing Device Plugin for Kubernetes Cluster ☆490 Updated 2 years ago
- CSI driver for Ceph ☆1,476 Updated last week
- A Cloud Native Batch System (Project under CNCF) ☆5,066 Updated this week
- NVIDIA container runtime library ☆1,037 Updated last week
- Simple Kubernetes Operator for MinIO clusters ☆1,388 Updated last month
- CLI and validation tools for Kubelet Container Runtime Interface (CRI). ☆1,884 Updated last week
- A CNI meta-plugin for multi-homed pods in Kubernetes ☆2,718 Updated this week
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆609 Updated last month