Sample code showing how to call collective operation functions in multi-GPU environments. It includes simple examples of the broadcast, reduce, allGather, reduceScatter, and sendRecv operations.
☆35 · Aug 28, 2023 · Updated 2 years ago
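For readers unfamiliar with the five operations named above, here is a minimal sketch of what each one does to per-rank buffers. It is a pure-Python simulation, not the NCCL API and not code from this repository: each "rank" is just a list, and the function names (`broadcast`, `reduce_to_root`, `all_gather`, `reduce_scatter`, `send_recv`) are hypothetical helpers chosen for illustration. Real NCCL code operates on device buffers with communicators and CUDA streams.

```python
from functools import reduce as fold
from operator import add

# Each element of `bufs` stands in for one rank's buffer (hypothetical model).

def broadcast(bufs, root):
    # Every rank ends up with a copy of the root's buffer.
    return [list(bufs[root]) for _ in bufs]

def reduce_to_root(bufs, root, op=add):
    # Element-wise reduction across ranks; only the root holds the result.
    # In this sketch, non-root ranks simply keep their input unchanged.
    total = [fold(op, col) for col in zip(*bufs)]
    return [total if r == root else list(b) for r, b in enumerate(bufs)]

def all_gather(bufs):
    # Every rank receives the concatenation of all buffers, in rank order.
    gathered = [x for b in bufs for x in b]
    return [list(gathered) for _ in bufs]

def reduce_scatter(bufs, op=add):
    # Element-wise reduction, then rank r keeps only the r-th equal chunk.
    # Assumes the buffer length is divisible by the number of ranks.
    n = len(bufs)
    total = [fold(op, col) for col in zip(*bufs)]
    chunk = len(total) // n
    return [total[r * chunk:(r + 1) * chunk] for r in range(n)]

def send_recv(bufs, src, dst):
    # Point-to-point: dst's buffer is overwritten with src's data.
    out = [list(b) for b in bufs]
    out[dst] = list(bufs[src])
    return out

if __name__ == "__main__":
    ranks = [[1, 2], [3, 4]]  # two simulated GPUs
    print(broadcast(ranks, 0))
    print(reduce_to_root(ranks, 1))
    print(all_gather(ranks))
    print(reduce_scatter(ranks))
    print(send_recv(ranks, 0, 1))
```

The simulation returns new lists instead of mutating in place, which makes the before and after state of every rank easy to compare.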
Alternatives and similar repositories for NCCL
Users who are interested in NCCL are comparing it to the libraries listed below.
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆20 · May 29, 2018 · Updated 7 years ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API ☆35 · Sep 15, 2023 · Updated 2 years ago
- [CF ’20] Verified Instruction-Level Energy Consumption Measurement for NVIDIA GPUs ☆15 · Dec 11, 2020 · Updated 5 years ago
- Pure Rust implementation of the post-quantum secure digital signature scheme FAEST ☆17 · Feb 19, 2026 · Updated last month
- A basic repository for a Clang-based tool, with CMake integration. ☆10 · Sep 22, 2023 · Updated 2 years ago
- Github repository for "Big Data in Astrophysics" - Spring 2021 ☆15 · Apr 26, 2021 · Updated 4 years ago
- TLLM_QMM strips the implementation of quantized kernels of Nvidia's TensorRT-LLM, removing NVInfer dependency and exposes ease of use Pyt… ☆16 · Jul 5, 2024 · Updated last year
- Benchmarks ☆18 · Apr 28, 2025 · Updated 10 months ago
- ☆163 · Dec 27, 2024 · Updated last year
- Acoustic reverse-time migration using GPU card and POSIX thread based on the adaptive optimal finite-difference scheme and the hybrid abs… ☆17 · Jan 27, 2018 · Updated 8 years ago
- ☆16 · Aug 20, 2024 · Updated last year
- Student handbook for the Applied Galactic Dynamics School at the Flatiron Institute (2021) ☆11 · Jul 6, 2021 · Updated 4 years ago
- Knowledge-Augmented Language Models for Cause-Effect Relation Classification https://arxiv.org/abs/2112.08615 ☆14 · Jun 14, 2023 · Updated 2 years ago
- ☆26 · Feb 17, 2025 · Updated last year
- Github repository for "Big Data in Astrophysics" - Spring 2022 ☆14 · Apr 27, 2022 · Updated 3 years ago
- GEMM by WMMA (tensor core) ☆15 · Jul 31, 2022 · Updated 3 years ago
- Multiple-precision GPU accelerated linear algebra routines (dense and sparse) based on residue number system ☆21 · Dec 19, 2022 · Updated 3 years ago
- An Easy-to-understand TensorOp Matmul Tutorial ☆422 · Mar 5, 2026 · Updated 3 weeks ago
- ☆27 · Jan 8, 2024 · Updated 2 years ago
- Automated testing for XML XPath execution ☆18 · Jan 5, 2024 · Updated 2 years ago
- MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant GPU Clusters ☆21 · Apr 21, 2023 · Updated 2 years ago
- Distributed k-nearest Neighbors using Locality Sensitive Hashing and SYCL ☆10 · Jun 7, 2021 · Updated 4 years ago
- Docker image for running Vivado in a 64-bit Debian Jessie container ☆13 · Mar 17, 2018 · Updated 8 years ago
- Surrogate-based Hyperparameter Tuning System ☆29 · Jun 29, 2023 · Updated 2 years ago
- Repository to hold the lecture and tutorial notes for the 2022 GW school ☆14 · Jul 24, 2022 · Updated 3 years ago
- Collection of validation scripts, notebooks, results ☆11 · Dec 23, 2025 · Updated 3 months ago
- ☆17 · Jul 8, 2021 · Updated 4 years ago
- CSCI 3753 - Operating Systems, Spring 2015 ☆12 · Feb 28, 2021 · Updated 5 years ago
- NCCL Tests ☆1,463 · Mar 11, 2026 · Updated 2 weeks ago
- Notebooks for PHYS 440/540 at Drexel University ☆16 · Dec 5, 2024 · Updated last year
- Memory footprint reduction for transformer models ☆11 · Jan 24, 2023 · Updated 3 years ago
- A C++11 high-performance web server that supports multi-threaded and single-threaded modes, uses the Reactor model, and follows the muduo library's one loop per thread design ☆12 · Aug 3, 2023 · Updated 2 years ago
- VASim is a virtual homogeneous non-deterministic finite automata simulator and transformation tool. VASim can parse, transform, … ☆36 · May 17, 2024 · Updated last year
- Notebooks used in the Quantum Scholars 2023 program ☆19 · Aug 13, 2023 · Updated 2 years ago
- Auto-differentiation library for C++ ☆12 · Jan 16, 2022 · Updated 4 years ago
- Karmada APIs ☆15 · Mar 10, 2026 · Updated 2 weeks ago
- iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud. ☆39 · Jun 11, 2024 · Updated last year
- Pypi Fetcher for Nix with simplified interface. (contains hashes for all packages) ☆15 · Nov 7, 2023 · Updated 2 years ago
- Benchmarks for python ☆27 · Jun 6, 2025 · Updated 9 months ago