LLNL / Aluminum
High-performance, GPU-aware communication library
☆84 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Aluminum
- ☆22 Updated 3 years ago
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward ☆20 Updated 6 years ago
- OpenSHMEM Implementation on MPI ☆25 Updated 2 months ago
- Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4, the Open Fabric … ☆63 Updated this week
- RAJA Performance Suite ☆110 Updated this week
- XSBench: The Monte Carlo Macroscopic Cross Section Lookup Benchmark ☆72 Updated 7 months ago
- ☆41 Updated 4 years ago
- Comb is a communication performance benchmarking tool. ☆23 Updated last year
- HPCG benchmark based on ROCm platform ☆35 Updated last week
- oneAPI Collective Communications Library (oneCCL) ☆201 Updated this week
- gossip: Efficient Communication Primitives for Multi-GPU Systems ☆58 Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆22 Updated last month
- Next generation LAPACK implementation for ROCm platform ☆93 Updated this week
- Reference implementations of MLPerf™ HPC training benchmarks ☆41 Updated 5 months ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆99 Updated this week
- A Micro-benchmarking Tool for HPC Networks ☆19 Updated last week
- A task benchmark ☆39 Updated 3 months ago
- A light-weight MPI profiler. ☆83 Updated 3 months ago
- CUDA Tensor Transpose (cuTT) library ☆49 Updated 7 years ago
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core … ☆50 Updated this week
- Integrated Performance Monitoring for High Performance Computing ☆85 Updated 3 years ago
- Collective library ☆8 Updated 3 years ago
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆100 Updated last year
- ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆39 Updated last year
- SCR caches checkpoint data in storage on the compute nodes of a Linux cluster to provide a fast, scalable checkpoint / restart capability… ☆99 Updated last week
- Unified Collective Communication Library ☆205 Updated this week
- YASK--Yet Another Stencil Kit: a domain-specific language and framework to create high-performance stencil code for implementing finite-d… ☆104 Updated 3 months ago
- Partitioned Global Address Space (PGAS) library for distributed arrays ☆100 Updated this week
- sparse matrix pre-processing library ☆81 Updated 6 months ago
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆60 Updated 6 years ago