LLNL / Aluminum
High-performance, GPU-aware communication library
☆87 Updated 4 months ago
Alternatives and similar repositories for Aluminum
Users who are interested in Aluminum are comparing it to the libraries listed below.
- gossip: Efficient Communication Primitives for Multi-GPU Systems ☆59 Updated 2 years ago
- Reference implementations of MLPerf™ HPC training benchmarks ☆48 Updated 3 months ago
- Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric … ☆70 Updated last month
- ☆23 Updated 4 years ago
- RAJA Performance Suite ☆117 Updated last week
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward ☆24 Updated 7 years ago
- A task benchmark ☆42 Updated 10 months ago
- oneAPI Collective Communications Library (oneCCL) ☆234 Updated 2 weeks ago
- GPUDirect Async support for IB Verbs ☆117 Updated 2 years ago
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆54 Updated 2 weeks ago
- CUDA Tensor Transpose (cuTT) library ☆51 Updated 7 years ago
- A hierarchical collective communications library with portable optimizations ☆35 Updated 6 months ago
- Unified Collective Communication Library ☆256 Updated this week
- Integrated Performance Monitoring for High Performance Computing ☆89 Updated 3 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 Updated 7 years ago
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆108 Updated 2 years ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- Kernel Tuning Toolkit ☆59 Updated 3 weeks ago
- ☆44 Updated 4 years ago
- Comb is a communication performance benchmarking tool. ☆25 Updated 2 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆46 Updated 10 years ago
- pytorch ucc plugin ☆21 Updated 3 years ago
- Next generation LAPACK implementation for ROCm platform ☆101 Updated this week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆86 Updated last week
- Advanced Profiling and Analytics for AMD Hardware ☆156 Updated this week
- OpenSHMEM Application Programming Interface ☆56 Updated 6 months ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU ☆81 Updated 5 years ago
- sparse matrix pre-processing library ☆82 Updated last year
- A GPU accelerated error-bounded lossy compression for scientific data. ☆74 Updated last week
- Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm ☆206 Updated last month