LLNL / Aluminum

High-performance, GPU-aware communication library

☆86

Alternatives and similar repositories for Aluminum

Users who are interested in Aluminum are comparing it to the libraries listed below.

ecrc / kblas-gpu
Subset of BLAS routines optimized for NVIDIA GPUs
☆76 Updated 2 years ago
NVIDIA / df-nvshmem-prototype
Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward
☆25 Updated 7 years ago
llnl / RAJAPerf
RAJA Performance Suite
☆130 Updated this week
Sandia-OpenSHMEM / SOS
Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric …
☆77 Updated 5 months ago
NVIDIA / mpi-acx
MPI accelerator-integrated communication extensions
☆39 Updated 2 years ago
eth-cscs / COSMA
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
☆212 Updated this week
llnl / LULESH
Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH)
☆115 Updated 2 years ago
ROCm / rocprofiler-compute
[DEPRECATED] Moved to ROCm/rocm-systems repo
☆165 Updated last week
pssrawat / artemis
GPU Code optimizer for stencil computations. Refer to our IPDPS'19 paper for more details
☆25 Updated 6 years ago
eth-cscs / Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆32 Updated 10 months ago
mlcommons / hpc
Reference implementations of MLPerf™ HPC training benchmarks
☆49 Updated 11 months ago
StanfordLegion / task-bench
A task benchmark
☆44 Updated last year
ROCm / rocHPL
High Performance Linpack for Next-Generation AMD HPC Accelerators
☆65 Updated 2 months ago
jeffhammond / dpcpp-tutorial
Intel Data Parallel C++ (and SYCL 2020) Tutorial.
☆95 Updated 4 years ago
cyanguwa / nersc-roofline
☆49 Updated 5 years ago
nerscadmin / IPM
Integrated Performance Monitoring for High Performance Computing
☆91 Updated 4 years ago
llnl / mpiP
A light-weight MPI profiler.
☆105 Updated 4 months ago
ECP-VeloC / VELOC
Very-Low Overhead Checkpointing System
☆59 Updated 6 months ago
ICLDisco / parsec
PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core …
☆76 Updated 3 months ago
llnl / Quicksilver
A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037
☆47 Updated 2 years ago
llnl / Comb
Comb is a communication performance benchmarking tool.
☆26 Updated 2 years ago
llnl / mpibind
Pragmatic, Productive, and Portable Affinity for HPC
☆51 Updated 3 weeks ago
merthidayetoglu / CommBench
A Micro-benchmarking Tool for HPC Networks
☆34 Updated 5 months ago
openucx / ucc
Unified Collective Communication Library
☆291 Updated last week
UoB-HPC / BabelStream
STREAM, for lots of devices written in many programming models
☆355 Updated 5 months ago
gunrock / loops
🎃 GPU load-balancing library for regular and irregular computations.
☆66 Updated 5 months ago
openucx / xccl
☆26 Updated 4 years ago
szcompressor / cuSZ
A GPU accelerated error-bounded lossy compression for scientific data.
☆95 Updated last month
ROCm / rocSPARSE
[DEPRECATED] Moved to ROCm/rocm-libraries repo
☆134 Updated 2 weeks ago
ROCm / rocHPCG
HPCG benchmark based on ROCm platform
☆39 Updated last week