mayooot / build-nccl-tests-with-pytorch
Build NCCL-Tests and configure SSHD in a PyTorch container to help you test NCCL faster!
☆9Updated 6 months ago
Related projects:
- A Slurm cluster for Kubernetes☆36Updated last month
- NVIDIA NCCL Tests for Distributed Training☆59Updated last month
- InfiniBand SR-IOV CNI☆42Updated 2 weeks ago
- NVIDIA vGPU Device Manager manages NVIDIA vGPU devices on top of Kubernetes☆99Updated this week
- Device plugins for Volcano, e.g. GPU☆98Updated last week
- The NVIDIA Driver Manager is a Kubernetes component which assists in seamless upgrades of the NVIDIA driver on each node of the cluster.☆33Updated this week
- RDMA CNI plugin for containerized workloads☆39Updated 2 weeks ago
- The BeeGFS Container Storage Interface (CSI) driver provides high performing and scalable storage for workloads running in Kubernetes. 📦…☆66Updated last month
- elastic-gpu-scheduler is a Kubernetes scheduler extender for GPU resource scheduling.☆134Updated last year
- A general-purpose GPU monitor, which can monitor GPU cards and the usage of each pod or container.☆19Updated 2 years ago
- 3-k platform is for training LLMs☆13Updated last week
- MCAD v2☆10Updated 4 months ago
- Public repository for the BeeGFS Parallel File System☆68Updated 2 months ago
- MIG Partition Editor for NVIDIA GPUs☆163Updated this week
- NVIDIA k8s device plugin for Kubevirt☆222Updated last month
- elastic-gpu-agent is a Kubernetes device plugin for GPU resource allocation on a node.☆54Updated 2 years ago
- The API (CRD) of Volcano☆33Updated last week
- A Lustre container storage interface that allows Kubernetes to mount/unmount provisioned Lustre filesystems into containers.☆26Updated 3 weeks ago
- Bitfusion with Kubernetes Integration Support☆51Updated 10 months ago
- DPDK & SR-IOV CNI plugin☆19Updated 4 years ago
- ☸️ Easy, advanced inference platform for large language models on Kubernetes☆15Updated this week
- Slurm cluster over k8s☆14Updated 4 years ago
- A Kubernetes plugin that enables dynamically adding or removing GPU resources for a running Pod☆118Updated 2 years ago
- HAMi-core compiles libvgpu.so, which enforces a hard limit on the GPU in a container☆72Updated 2 weeks ago
- The NVIDIA GPU driver container allows the provisioning of the NVIDIA driver through the use of containers.☆62Updated this week