run-ai / docs
markdown docs
☆89 · Updated last week
Alternatives and similar repositories for docs
Users who are interested in docs are comparing it to the repositories listed below.
- Controller for ModelMesh ☆232 · Updated last week
- A top-like tool for monitoring GPUs in a cluster ☆84 · Updated last year
- MIG Partition Editor for NVIDIA GPUs ☆201 · Updated last week
- Run cloud native workloads on NVIDIA GPUs ☆180 · Updated last month
- ☆221 · Updated this week
- Distributed Model Serving Framework ☆170 · Updated 2 weeks ago
- GPU plugin to the node feature discovery for Kubernetes ☆300 · Updated last year
- Dynamic Resource Allocation (DRA) for NVIDIA GPUs in Kubernetes ☆377 · Updated this week
- AWS virtual gpu device plugin provides capability to use smaller virtual gpus for your machine learning inference workloads ☆205 · Updated last year
- ☆24 · Updated last month
- An Operator for deployment and maintenance of NVIDIA NIMs and NeMo microservices in a Kubernetes environment. ☆114 · Updated this week
- Kubernetes Operator for MPI-based applications (distributed training, HPC, etc.) ☆482 · Updated last month
- Holistic job manager on Kubernetes ☆116 · Updated last year
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resources scheduling. ☆141 · Updated 2 years ago
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times ☆88 · Updated 3 years ago
- Share GPU between Pods in Kubernetes ☆209 · Updated 2 years ago
- NVIDIA NCCL Tests for Distributed Training ☆97 · Updated this week
- Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs. ☆204 · Updated 2 months ago
- Automatic tuning for ML model deployment on Kubernetes ☆80 · Updated 7 months ago
- NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs ☆532 · Updated last month
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers. ☆113 · Updated this week
- A collection of YAML files, Helm Charts, Operator code, and guides to act as an example reference implementation for NVIDIA NIM deploymen… ☆184 · Updated 2 weeks ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆98 · Updated this week
- NVIDIA Network Operator ☆257 · Updated this week
- User documentation for KServe. ☆106 · Updated 2 weeks ago
- Repository for open inference protocol specification ☆56 · Updated last month
- An efficient GPU resource sharing system with fine-grained control for Linux platforms. ☆83 · Updated last year
- ☆114 · Updated 2 weeks ago
- Module, Model, and Tensor Serialization/Deserialization ☆240 · Updated last week
- Collection of tools and examples for managing Accelerated workloads in Kubernetes Engine ☆235 · Updated this week