1duo/nccl-examples

NCCL Examples from Official NVIDIA NCCL Developer Guide.

☆21

Alternatives and similar repositories for nccl-examples

Users who are interested in nccl-examples are comparing it to the repositories listed below.

muriloboratto / NCCL
Examples of how to call collective operation functions in multi-GPU environments. A simple example of using broadcast, reduce, all…
☆36 · Aug 28, 2023 · Updated 2 years ago
ajtulloch / sparse-ads-baselines
☆10 · May 4, 2023 · Updated 3 years ago
IronySuzumiya / NiuDianNao
A simple cycle-accurate DaDianNao simulator
☆13 · Mar 27, 2019 · Updated 7 years ago
hao-ai-lab / flash-attention-fp4
NVFP4 Flash-Attention 4 on Blackwell
☆30 · Updated this week
parasailteam / coconet
☆85 · Dec 2, 2022 · Updated 3 years ago
NVIDIA / compute-sanitizer-samples
Samples demonstrating how to use the Compute Sanitizer Tools and Public API
☆99 · Nov 6, 2023 · Updated 2 years ago
muriloboratto / NVSHEMEM
Sample codes using NVSHMEM on multi-GPU
☆30 · Jan 22, 2023 · Updated 3 years ago
haanjack / mnist-cudnn
CUDA for MNIST training/inference
☆44 · Dec 30, 2023 · Updated 2 years ago
enzorucci / SWIFOLD
Smith-Waterman Acceleration on Intel’s FPGA with OpenCL for Long DNA Sequences
☆19 · Jan 25, 2019 · Updated 7 years ago
mohamed / roofline
A simple script to plot the Roofline model for given HW platforms and applications
☆10 · Mar 17, 2026 · Updated 4 months ago
gzz2000 / RoSSH
🛠 Robust SSH: auto-reconnect SSH session that preserves your running shell and command. Intuitive, no server-side setup, aimed at simplic…
☆13 · Nov 14, 2025 · Updated 8 months ago
kriegalex / vscode-cuda
CUDA C++ syntax support & snippets for VSCode
☆20 · Apr 1, 2021 · Updated 5 years ago
gitHubwhl562916378 / IntelCudaVideoDecodDetect
Uses the CUDA and QSV hardware decoders in ffmpeg-4.3, wraps them, and displays the output with QOpenGLWidget; intended for testing video decoding and display performance
☆19 · Feb 24, 2021 · Updated 5 years ago
YYYYYHC / Measure_Cache_Size
This is a simple implementation of Saavedra-Barrera's paper SAAVEDRA-BARRERA R H. CPU Performance Evaluation and Execution Time Predictio…
☆10 · Nov 23, 2021 · Updated 4 years ago
gpudirect / libmp
Simple message passing library
☆30 · Aug 28, 2018 · Updated 7 years ago
ROCm / aws-ofi-rccl
☆18 · Nov 11, 2025 · Updated 8 months ago
PawseySC / rocm-from-source
Scripts to build AMD ROCm from source.
☆16 · Oct 31, 2024 · Updated last year
gmyrianthous / example-publish-pypi
Example Python package to demonstrate how to publish packages on PyPI
☆14 · Jan 8, 2022 · Updated 4 years ago
DE-RSE / de-rse.github.io
Web repository
☆14 · Jul 6, 2026 · Updated 2 weeks ago
jeremad / cuda-travis
☆18 · Aug 22, 2019 · Updated 6 years ago
MingliSun / MLIR-TVM
☆13 · Nov 25, 2019 · Updated 6 years ago
jeng1220 / cuGemmProf
A simple tool to profile performance of multiple combinations of GEMM of cuBLAS
☆25 · Feb 9, 2021 · Updated 5 years ago
qiime2 / q2studio
Prototype graphical user interface for QIIME 2
☆15 · Feb 27, 2023 · Updated 3 years ago
kuroko1t / nne
Convert a PyTorch model to a model for edge devices
☆20 · Mar 15, 2023 · Updated 3 years ago
parabix / parabix-devel-mirror
Automated Mirror of Parabix - Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at the…
☆17 · Aug 12, 2021 · Updated 4 years ago
gty111 / GEMM_WMMA
GEMM by WMMA (tensor core)
☆15 · Jul 31, 2022 · Updated 3 years ago
microsoft / NPKit
NCCL Profiling Kit
☆155 · Jul 1, 2024 · Updated 2 years ago
yuninxia / awesome-gemm
📚 A curated list of awesome matrix-matrix multiplication (A * B = C) frameworks, libraries and software
☆67 · Feb 23, 2025 · Updated last year
hananshafi / MTL-ViT
A new multi-task learning framework using Vision Transformers
☆11 · Jun 19, 2024 · Updated 2 years ago
flashinfer-ai / debug-print
Debug print operator for cudagraph debugging
☆18 · Aug 2, 2024 · Updated last year
1duo / design-patterns-for-humans
Design Patterns for Humans™ - An ultra-simplified explanation (examples in C++/Python)
☆10 · Nov 5, 2018 · Updated 7 years ago
wangsiping97 / GPU-Tutorials
Tutorials on GPU programming. Reading notes.
☆19 · Apr 27, 2023 · Updated 3 years ago
AOSC-Archive / newsroom
Community newsletters, special issues, etc. for public platforms
☆10 · Aug 25, 2025 · Updated 10 months ago
blackjack2015 / NV-DVFS-Benchmark
☆10 · Aug 21, 2023 · Updated 2 years ago
mcrl / tccl
Thunder Research Group's Collective Communication Library
☆53 · Jul 8, 2025 · Updated last year
gpudirect / gdasync
GPUDirect Async suite
☆16 · Dec 5, 2018 · Updated 7 years ago
lucidrains / flash-attention
Fast and memory-efficient exact attention
☆20 · Jul 22, 2024 · Updated 2 years ago
mmocean / shttpd
shttpd - annotated HTTP server source code
☆16 · Sep 12, 2020 · Updated 5 years ago
microsoft / TE-CCL
☆56 · Aug 27, 2024 · Updated last year