uxlfoundation / oneCCL
oneAPI Collective Communications Library (oneCCL)
☆223 Updated last month
Alternatives and similar repositories for oneCCL:
Users who are interested in oneCCL are comparing it to the libraries listed below.
- ROCm Communication Collectives Library (RCCL) ☆303 Updated this week
- oneCCL Bindings for Pytorch* ☆89 Updated 2 months ago
- Profiling Tools Interfaces for GPU (PTI for GPU) is a set of Getting Started Documentation and Tools Library to start performance analysi… ☆218 Updated this week
- GPUDirect Async support for IB Verbs ☆104 Updated 2 years ago
- OpenAI Triton backend for Intel® GPUs ☆165 Updated this week
- oneAPI Level Zero Conformance & Performance test content ☆48 Updated this week
- Unified Collective Communication Library ☆229 Updated this week
- RDMA and SHARP plugins for nccl library ☆177 Updated last month
- ☆327 Updated 10 months ago
- ☆60 Updated 2 months ago
- NCCL Profiling Kit ☆127 Updated 8 months ago
- RCCL Performance Benchmark Tests ☆59 Updated last month
- Stretching GPU performance for GEMMs and tensor contractions. ☆233 Updated this week
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- Synthesizer for optimal collective communication algorithms ☆104 Updated 10 months ago
- NCCL Fast Socket is a transport layer plugin to improve NCCL collective communication performance on Google Cloud. ☆115 Updated last year
- A tool for bandwidth measurements on NVIDIA GPUs. ☆375 Updated 3 weeks ago
- collection of benchmarks to measure basic GPU capabilities ☆302 Updated 2 weeks ago
- STREAM, for lots of devices written in many programming models ☆326 Updated 6 months ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆263 Updated last month
- Microsoft Collective Communication Library ☆339 Updated last year
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for… ☆128 Updated last week
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆135 Updated last week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note… ☆60 Updated 2 months ago
- CUDA Kernel Benchmarking Library ☆578 Updated 3 months ago
- Magnum IO community repo ☆84 Updated last month
- oneAPI Level Zero Specification Headers and Loader ☆238 Updated this week
- This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications. ☆164 Updated this week
- A hierarchical collective communications library with portable optimizations ☆29 Updated 2 months ago
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆72 Updated last year