NVIDIA / ansible-role-nvidia-docker
☆34 Updated 11 months ago
Related projects:
- ☆112 Updated last month
- Tools to deploy GPU clusters in the Cloud ☆30 Updated last year
- Singularity implementation of k8s operator for interacting with SLURM. ☆118 Updated 3 years ago
- Scheduling GPU cluster workloads with Slurm ☆73 Updated 5 years ago
- A top-like tool for monitoring GPUs in a cluster ☆80 Updated 7 months ago
- The Singularity implementation of the Kubernetes Container Runtime Interface ☆114 Updated 3 years ago
- Kubernetes Operator, ansible playbooks, and production scripts for large-scale AIStore deployments on Kubernetes. ☆66 Updated last week
- MIG Partition Editor for NVIDIA GPUs ☆163 Updated this week
- files and instructions for creating and using example containers from the sylabs.io blog ☆103 Updated last year
- NGC Container Replicator ☆28 Updated last year
- Run cloud native workloads on NVIDIA GPUs ☆124 Updated 2 weeks ago
- ☆82 Updated this week
- server for storage and management of singularity images ☆103 Updated 2 months ago
- An open-source toolkit for deploying and managing high performance clusters for HPC, AI, and data analytics workloads. ☆217 Updated last week
- Slurm in Docker - Exploring Slurm using CentOS 7 based Docker images ☆119 Updated 4 years ago
- Container plugin for Slurm Workload Manager ☆278 Updated last month
- Fork of NVIDIA device plugin for Kubernetes with support for shared GPUs by declaring GPUs multiple times ☆87 Updated 2 years ago
- The NVIDIA Driver Manager is a Kubernetes component which assists in seamless upgrades of the NVIDIA Driver on each node of the cluster. ☆33 Updated this week
- OCI-compatible engine to deploy Linux containers on HPC environments. ☆129 Updated 2 weeks ago
- Prometheus exporter for performance metrics from Slurm. ☆227 Updated 3 months ago
- GPU Environment Management for Visual Studio Code ☆35 Updated last year
- NGC VMI Example Scripts ☆23 Updated 5 years ago
- nvidiagpubeat is an elastic beat that uses NVIDIA System Management Interface (nvidia-smi) to monitor NVIDIA GPU devices and can ingest m… ☆54 Updated 3 years ago
- Deploy a Flux MiniCluster to Kubernetes with the operator ☆31 Updated last month
- GPU plugin to the node feature discovery for Kubernetes ☆287 Updated 3 months ago
- Container-based Slurm cluster with support for running on multiple ssh-accessible computers. Currently it is based on podman, systemd, no… ☆20 Updated 3 years ago
- Nvidia-smi Prometheus exporter with respecting of GPU-UUID ☆32 Updated last year
- Ansible role for installing and managing the Slurm Workload Manager ☆84 Updated 5 months ago
- DVC GitHub action ☆30 Updated 2 months ago
- Simple scripts and instructions for getting the most out of your kubernetes cluster. Includes scripts for Charmed Kubernetes and Microk8s… ☆19 Updated 4 years ago