NVIDIA/nvbandwidth

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/NVIDIA/nvbandwidth)

NVIDIA / nvbandwidth

A tool for bandwidth measurements on NVIDIA GPUs.

☆732

Alternatives and similar repositories for nvbandwidth

Users that are interested in nvbandwidth are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

NVIDIA / nccl-tests
View on GitHub
NCCL Tests
☆1,595Jul 9, 2026Updated last week
Mellanox / nccl-rdma-sharp-plugins
View on GitHub
RDMA and SHARP plugins for nccl library
☆233Apr 3, 2026Updated 3 months ago
NVIDIA / gdrcopy
View on GitHub
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
☆1,399Updated this week
NVIDIA / DCGM
View on GitHub
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
☆762Jul 6, 2026Updated 2 weeks ago
NVIDIA / nvbench
View on GitHub
CUDA Kernel Benchmarking Library
☆900Updated this week
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
NVIDIA / nccl
View on GitHub
Optimized primitives for collective multi-GPU communication
☆4,892Updated this week
NVIDIA / multi-gpu-programming-models
View on GitHub
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
☆908Sep 26, 2025Updated 9 months ago
ai-dynamo / nixl
View on GitHub
NVIDIA Inference Xfer Library (NIXL)
☆1,138Updated this week
microsoft / NPKit
View on GitHub
NCCL Profiling Kit
☆155Jul 1, 2024Updated 2 years ago
Mellanox / nv_peer_memory
View on GitHub
☆399Apr 23, 2024Updated 2 years ago
NVIDIA / nvshmem
View on GitHub
NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process com…
☆560Updated this week
NVIDIA / cuda-samples
View on GitHub
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
☆9,404May 27, 2026Updated last month
NVIDIA / NVTX
View on GitHub
The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou…
☆544Updated this week
NVIDIA / nvloom
View on GitHub
nvloom is a set of tools designed to scalably test MNNVL fabrics.
☆50Apr 1, 2026Updated 3 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
linux-rdma / perftest
View on GitHub
Infiniband Verbs Performance Tests
☆998Jul 12, 2026Updated last week
google / nccl-fastsocket
View on GitHub
NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud.
☆125Nov 15, 2023Updated 2 years ago
meta-pytorch / torchcomms
View on GitHub
torchcomms: a modern PyTorch communications API
☆377Updated this week
NVIDIA / cutlass
View on GitHub
CUDA Templates and Python DSLs for High-Performance Linear Algebra
☆10,104Updated this week
microsoft / mscclpp
View on GitHub
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆541Updated this week
coreweave / nccl-tests
View on GitHub
NVIDIA NCCL Tests for Distributed Training
☆149Jul 8, 2026Updated last week
RRZE-HPC / gpu-benches
View on GitHub
collection of benchmarks to measure basic GPU capabilities
☆530Oct 24, 2025Updated 8 months ago
NVIDIA / TransformerEngine
View on GitHub
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on H…
☆3,434Updated this week
flashinfer-ai / flashinfer
View on GitHub
FlashInfer: Kernel Library for LLM Serving
☆5,983Updated this week
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
microsoft / msccl
View on GitHub
Microsoft Collective Communication Library
☆394Sep 20, 2023Updated 2 years ago
NVIDIA / GPUStressTest
View on GitHub
GPU Stress Test is a tool to stress the compute engine of NVIDIA Tesla GPU’s by running a BLAS matrix multiply using different data types…
☆126Jul 8, 2025Updated last year
bytedance / flux
View on GitHub
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
☆1,343Aug 28, 2025Updated 10 months ago
NVIDIA / cuda-checkpoint
View on GitHub
CUDA checkpoint and restore utility
☆473Jul 6, 2026Updated 2 weeks ago
NVIDIA / gds-nvidia-fs
View on GitHub
NVIDIA GPUDirect Storage Driver
☆367Jun 1, 2026Updated last month
openucx / ucc
View on GitHub
Unified Collective Communication Library
☆310Updated this week
ByteDance-Seed / Triton-distributed
View on GitHub
Distributed Compiler based on Triton for Parallel Systems
☆1,493Jul 11, 2026Updated last week
uccl-project / uccl
View on GitHub
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g…
☆1,465Updated this week
ROCm / rccl
View on GitHub
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆419Updated this week
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
NVIDIA / Fuser
View on GitHub
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆396May 31, 2026Updated last month
NVIDIA / cccl
View on GitHub
CUDA Core Compute Libraries
☆2,431Updated this week
kvcache-ai / Mooncake
View on GitHub
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
☆5,909Updated this week
KuangjuX / NVSHMEM-Tutorial
View on GitHub
NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer
☆195Feb 11, 2026Updated 5 months ago
NVIDIA / dcgm-exporter
View on GitHub
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
☆1,808Updated this week
osayamenja / FlashMoE
View on GitHub
Distributed MoE in a Single Kernel [NeurIPS '25]
☆272May 5, 2026Updated 2 months ago
Azure / azurehpc-health-checks
View on GitHub
Health checks for Azure N- and H-series VMs.
☆58Jun 10, 2026Updated last month