uxlfoundation / oneCCL

oneAPI Collective Communications Library (oneCCL)

☆241

Alternatives and similar repositories for oneCCL

Users who are interested in oneCCL are comparing it to the libraries listed below.

openucx / ucc
Unified Collective Communication Library
☆263 Updated this week
ROCm / rccl
ROCm Communication Collectives Library (RCCL)
☆355 Updated this week
gpudirect / libgdsync
GPUDirect Async support for IB Verbs
☆128 Updated 2 years ago
ROCm / Tensile
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆247 Updated this week
ROCm / rccl-tests
RCCL Performance Benchmark Tests
☆71 Updated this week
NVIDIA / MagnumIO
Magnum IO community repo
☆95 Updated 2 months ago
oneapi-src / level-zero-tests
oneAPI Level Zero Conformance & Performance test content
☆54 Updated this week
ROCm / rocSHMEM
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
☆97 Updated this week
Mellanox / nv_peer_memory
☆362 Updated last year
intel / pti-gpu
Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi…
☆234 Updated last month
ROCm / rocprofiler
ROC profiler library. Profiling with perf-counters and derived metrics.
☆151 Updated 2 weeks ago
intel / xetla
☆62 Updated 7 months ago
intel / torch-ccl
oneCCL Bindings for Pytorch*
☆99 Updated 3 weeks ago
merthidayetoglu / HiCCL
A hierarchical collective communications library with portable optimizations
☆36 Updated 7 months ago
NVIDIA / nvbandwidth
A tool for bandwidth measurements on NVIDIA GPUs.
☆496 Updated 3 months ago
intel / intel-xpu-backend-for-triton
OpenAI Triton backend for Intel® GPUs
☆197 Updated this week
Mellanox / nccl-rdma-sharp-plugins
RDMA and SHARP plugins for nccl library
☆200 Updated last month
ekondis / mixbench
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
☆412 Updated 6 months ago
ROCm / hipBLAS
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆145 Updated this week
ROCm / rocm_bandwidth_test
Bandwidth test for ROCm
☆62 Updated this week
NVIDIA / gds-nvidia-fs
NVIDIA GPUDirect Storage Driver
☆272 Updated 3 months ago
facebookresearch / param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…
☆147 Updated last week
oneapi-src / level-zero
oneAPI Level Zero Specification Headers and Loader
☆274 Updated this week
NVIDIA / Fuser
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
☆345 Updated this week
intel / intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note…
☆61 Updated last month
microsoft / mscclpp
MSCCL++: A GPU-driven communication stack for scalable AI applications
☆394 Updated this week
UoB-HPC / BabelStream
STREAM, for lots of devices written in many programming models
☆346 Updated 11 months ago
daadaada / turingas
Assembler for NVIDIA Volta and Turing GPUs
☆226 Updated 3 years ago
ROCm / rocprofiler-compute
Advanced Profiling and Analytics for AMD Hardware
☆161 Updated this week
ROCm / TransferBench
TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)
☆44 Updated this week