NVIDIA / gpu-operator
NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
☆2,511 · Updated this week
Alternatives and similar repositories for gpu-operator
Users who are interested in gpu-operator are comparing it to the repositories listed below.
- NVIDIA device plugin for Kubernetes ☆3,638 · Updated this week
- NVIDIA GPU metrics exporter for Prometheus leveraging DCGM ☆1,590 · Updated last month
- NVIDIA DRA Driver for GPUs ☆548 · Updated this week
- Kubernetes-native Job Queueing ☆2,270 · Updated last week
- KAI Scheduler is an open source Kubernetes Native scheduler for AI workloads at large scale ☆1,095 · Updated this week
- LeaderWorkerSet: An API for deploying a group of pods as a unit of replication ☆654 · Updated last week
- GPU Sharing Scheduler for Kubernetes Cluster ☆1,523 · Updated 2 years ago
- Kubeflow Deployment Manifests ☆986 · Updated 2 weeks ago
- A toolkit to run Ray applications on Kubernetes ☆2,292 · Updated this week
- Heterogeneous AI Computing Virtualization Middleware (Project under CNCF) ☆2,945 · Updated this week
- GPU plugin to the node feature discovery for Kubernetes ☆308 · Updated last year
- Distributed AI Model Training and LLM Fine-Tuning on Kubernetes ☆2,016 · Updated this week
- A Cloud Native Batch System (Project under CNCF) ☆5,276 · Updated this week
- Gateway API Inference Extension ☆573 · Updated this week
- Tools for monitoring NVIDIA GPUs on Linux ☆1,065 · Updated 4 years ago
- Node feature discovery for Kubernetes ☆989 · Updated last week
- Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster ☆368 · Updated last week
- Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity. ☆1,653 · Updated last week
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆510 · Updated last week
- Repository for out-of-tree scheduler plugins based on scheduler framework. ☆1,266 · Updated last month
- Simple Kubernetes Operator for MinIO clusters ☆1,415 · Updated last month
- GPU Sharing Device Plugin for Kubernetes Cluster ☆492 · Updated 3 years ago
- NVIDIA container runtime library ☆1,064 · Updated last week
- Tools for building GPU clusters ☆1,410 · Updated 3 weeks ago
- JobSet: a k8s native API for distributed ML training and HPC workloads ☆300 · Updated this week
- ☆891 · Updated last year
- AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-te… ☆1,134 · Updated this week
- Fast container image distribution plugin with lazy pulling ☆1,471 · Updated this week
- CLI and validation tools for Kubelet Container Runtime Interface (CRI). ☆1,942 · Updated last week
- Dynamically provisioning persistent local storage with Kubernetes ☆2,747 · Updated 2 weeks ago