Examples of how to call collective operation functions in multi-GPU environments, including simple uses of the broadcast, reduce, allGather, reduceScatter, and sendRecv operations.
☆35 · Aug 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for NCCL
Users that are interested in NCCL are comparing it to the libraries listed below.
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆20 · May 29, 2018 · Updated 7 years ago
- Tutorials for NVIDIA CUPTI samples ☆64 · Nov 3, 2025 · Updated 6 months ago
- [CF ’20] Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs ☆15 · Dec 11, 2020 · Updated 5 years ago
- Pure Rust implementation of the post-quantum secure digital signature scheme FAEST ☆19 · Apr 29, 2026 · Updated last week
- Combined solution from Matter Labs and Yrrid based on their respective submissions for the Z-Prize category Accelerating MSM Operations o… ☆16 · Oct 30, 2023 · Updated 2 years ago
- TLLM_QMM strips the implementation of quantized kernels of Nvidia's TensorRT-LLM, removing NVInfer dependency and exposes ease of use Pyt… ☆16 · Jul 5, 2024 · Updated last year
- Benchmarks ☆19 · Updated this week
- ☆165 · Dec 27, 2024 · Updated last year
- Acoustic reverse-time migration using GPU card and POSIX thread based on the adaptive optimal finite-difference scheme and the hybrid abs… ☆17 · Jan 27, 2018 · Updated 8 years ago
- OPHELib is an optimized library for partially homomorphic encryption. It currently provides an implementation of the Paillier encryption … ☆15 · May 29, 2019 · Updated 6 years ago
- Datalog Engines OPtimization Tester. ☆13 · Jan 18, 2024 · Updated 2 years ago
- Knowledge-Augmented Language Models for Cause-Effect Relation Classification https://arxiv.org/abs/2112.08615 ☆14 · Jun 14, 2023 · Updated 2 years ago
- ☆26 · Feb 17, 2025 · Updated last year
- ☆16 · Aug 20, 2024 · Updated last year
- A QEMU + gem5 co-simulation framework for AMD MI300X GPU research. ☆45 · Apr 29, 2026 · Updated last week
- GEMM by WMMA (tensor core) ☆15 · Jul 31, 2022 · Updated 3 years ago
- An Easy-to-understand TensorOp Matmul Tutorial ☆428 · Mar 5, 2026 · Updated 2 months ago
- Code to export Mendeley documents and metadata into Notion databases ☆12 · Feb 6, 2023 · Updated 3 years ago
- ☆27 · Jan 8, 2024 · Updated 2 years ago
- Automated testing for XML XPath execution ☆18 · Jan 5, 2024 · Updated 2 years ago
- Distributed k-nearest Neighbors using Locality Sensitive Hashing and SYCL ☆10 · Jun 7, 2021 · Updated 4 years ago
- Surrogate-based Hyperparameter Tuning System ☆30 · Jun 29, 2023 · Updated 2 years ago
- ☆17 · Jul 8, 2021 · Updated 4 years ago
- Notebooks for PHYS 440/540 at Drexel University ☆16 · Dec 5, 2024 · Updated last year
- NCCL Tests ☆1,505 · Apr 13, 2026 · Updated 3 weeks ago
- Memory footprint reduction for transformer models ☆11 · Jan 24, 2023 · Updated 3 years ago
- A repository for sharing D3.js plugins. ☆12 · Jun 4, 2015 · Updated 10 years ago
- A C++11 high-performance webserver that supports multi-threaded and single-threaded modes, uses the Reactor model, and follows the muduo library's one loop per thread design ☆12 · Aug 3, 2023 · Updated 2 years ago
- Script for doing Slurm Calculations ☆12 · Mar 21, 2025 · Updated last year
- Fast semantic search for bioRxiv manuscripts ☆12 · Feb 16, 2025 · Updated last year
- Karmada APIs ☆15 · Mar 10, 2026 · Updated last month
- iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud. ☆39 · Jun 11, 2024 · Updated last year
- Sample Codes using NVSHMEM on Multi-GPU ☆30 · Jan 22, 2023 · Updated 3 years ago
- Fortran bindings to the C++ Standard Library. ☆34 · Apr 7, 2025 · Updated last year
- A depletion framework for OpenMC ☆15 · Nov 27, 2017 · Updated 8 years ago
- ☆25 · Jan 20, 2026 · Updated 3 months ago
- Distributed IO-aware Attention algorithm ☆24 · Sep 24, 2025 · Updated 7 months ago
- ☆12 · Aug 30, 2024 · Updated last year
- Automatic Differentiation for high-performance stencil loops ☆13 · Mar 25, 2021 · Updated 5 years ago