NVIDIA NCCL Tests for Distributed Training
☆144 · Apr 29, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for nccl-tests
Users that are interested in nccl-tests are comparing it to the libraries listed below.
- ☆50 · Updated this week
- Getting Started with the CoreWeave Kubernetes GPU Cloud ☆83 · Jun 13, 2025 · Updated 11 months ago
- NCCL Tests ☆1,529 · Updated this week
- RDMA CNI plugin for containerized workloads ☆60 · Updated this week
- NVIDIA Network Operator ☆341 · Updated this week
- NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the … ☆291 · Updated this week
- This repo includes everything you need to know about deploying GPU nodes on OCI ☆49 · Updated this week
- Folding @ Home with NVIDIA GPU support ☆18 · Mar 20, 2020 · Updated 6 years ago
- RDMA and SHARP plugins for nccl library ☆231 · Apr 3, 2026 · Updated last month
- InfiniBand fabric monitoring daemon written in Go ☆32 · May 22, 2025 · Updated last year
- A tool for bandwidth measurements on NVIDIA GPUs. ☆700 · Apr 8, 2026 · Updated last month
- ☆40 · Apr 24, 2026 · Updated last month
- A toolkit for discovering cluster network topology. ☆130 · Updated this week
- Infiniband Verbs Performance Tests ☆964 · Apr 15, 2026 · Updated last month
- GPUd automates monitoring, diagnostics, and issue identification for GPUs ☆481 · Updated this week
- Synthesizer for optimal collective communication algorithms ☆123 · Apr 8, 2024 · Updated 2 years ago
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆155 · May 6, 2026 · Updated 2 weeks ago
- Multi-network CRD specification ☆53 · Apr 11, 2024 · Updated 2 years ago
- NCCL Profiling Kit ☆153 · Jul 1, 2024 · Updated last year
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆91 · Updated this week
- Module, Model, and Tensor Serialization/Deserialization ☆308 · Apr 30, 2026 · Updated 3 weeks ago
- knavigator is a development, testing, and optimization toolkit for AI/ML scheduling systems at scale on Kubernetes. ☆78 · Apr 14, 2026 · Updated last month
- ☆26 · May 19, 2021 · Updated 5 years ago
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 · Sep 30, 2025 · Updated 7 months ago
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆520 · Updated this week
- HPC tests using MPI codes & synthetic benchmarks with IB/RoCE comparisons - from StackHPC Ltd. ☆22 · Jul 11, 2022 · Updated 3 years ago
- SC24 Deep Learning at Scale Tutorial Material ☆35 · Feb 5, 2025 · Updated last year
- Health checks for Azure N- and H-series VMs. ☆57 · May 13, 2026 · Updated last week
- an implementation of parallel skills like amp, ddp, pp, tp for learning purposes ☆14 · Nov 18, 2023 · Updated 2 years ago
- InfiniBand SR-IOV CNI ☆58 · Updated this week
- ☆18 · Jan 24, 2019 · Updated 7 years ago
- Collection of scripts used for BlueField SoC system management. ☆31 · Apr 9, 2026 · Updated last month
- A collection of useful Go libraries to ease the development of NVIDIA Operators for GPU/NIC management. ☆30 · May 18, 2026 · Updated last week
- Mini CCL - A lightweight collective communication library ☆32 · Jan 2, 2026 · Updated 4 months ago
- GPUDirect Async support for IB Verbs ☆137 · Nov 10, 2022 · Updated 3 years ago
- HAMi-core compiles libvgpu.so, which ensures hard limit on GPU in container ☆309 · May 18, 2026 · Updated last week
- Linux based user-space RSHIM driver for the Mellanox BlueField SoC ☆36 · May 15, 2026 · Updated last week
- Orchestrating many small GPU clusters for running serverless GPU workloads ☆17 · Mar 15, 2026 · Updated 2 months ago
- Optimized primitives for collective multi-GPU communication ☆4,729 · Updated this week