HabanaAI / hccl_demo
☆23 Updated last week
Alternatives and similar repositories for hccl_demo
Users who are interested in hccl_demo are comparing it to the libraries listed below.
- RCCL Performance Benchmark Tests ☆75 Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆63 Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆83 Updated this week
- oneAPI Collective Communications Library (oneCCL) ☆245 Updated 3 weeks ago
- ☆22 Updated this week
- A hierarchical collective communications library with portable optimizations ☆36 Updated 10 months ago
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆41 Updated this week
- Ongoing research training transformer models at scale ☆29 Updated this week
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 Updated 7 months ago
- oneAPI Level Zero Conformance & Performance test content ☆57 Updated this week
- ROCm Communication Collectives Library (RCCL) ☆389 Updated this week
- NCCL Profiling Kit ☆145 Updated last year
- Microsoft Collective Communication Library ☆66 Updated 10 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆119 Updated this week
- ☆48 Updated this week
- oneCCL Bindings for Pytorch* ☆102 Updated 2 months ago
- Bandwidth test for ROCm ☆66 Updated this week
- NVIDIA NCCL Tests for Distributed Training ☆114 Updated this week
- Multi-GPU communication profiler and visualizer ☆35 Updated last year
- A tool for bandwidth measurements on NVIDIA GPUs. ☆545 Updated 6 months ago
- CUDA GPU Benchmark ☆33 Updated 8 months ago
- Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs ☆12 Updated 6 months ago
- Unified Collective Communication Library ☆277 Updated last week
- NCCL Examples from Official NVIDIA NCCL Developer Guide. ☆19 Updated 7 years ago
- Magnum IO community repo ☆100 Updated last month
- CloudAI Benchmark Framework ☆71 Updated this week
- ☆46 Updated 10 months ago
- MSCCL++: A GPU-driven communication stack for scalable AI applications ☆421 Updated last week
- RDMA and SHARP plugins for nccl library ☆209 Updated last month
- ☆59 Updated this week