eth-cscs / COSMA
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
☆212 Updated 3 weeks ago
Alternatives and similar repositories for COSMA
Users who are interested in COSMA compare it to the libraries listed below.
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆115 Updated 2 years ago
- RAJA Performance Suite ☆131 Updated last week
- High-performance, GPU-aware communication library ☆86 Updated last month
- ☆101 Updated this week
- This is a set of simple programs that can be used to explore the features of a parallel platform. ☆472 Updated last week
- STREAM, for lots of devices written in many programming models ☆355 Updated 5 months ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆348 Updated 2 months ago
- NPBench - A Benchmarking Suite for High-Performance NumPy ☆91 Updated last week
- Data parallel C++ mathematical object library ☆167 Updated 3 weeks ago
- Kokkos C++ Performance Portability Programming Ecosystem: Math Kernels - Provides BLAS, Sparse BLAS and Graph Kernels ☆372 Updated last week
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆165 Updated last week
- A light-weight MPI profiler. ☆105 Updated 4 months ago
- QUDA is a library for performing calculations in lattice QCD on GPUs. ☆341 Updated this week
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core … ☆76 Updated 3 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆133 Updated 2 weeks ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆95 Updated 4 years ago
- ☆275 Updated last week
- Subset of BLAS routines optimized for NVIDIA GPUs ☆76 Updated 2 years ago
- Kernel Tuner ☆381 Updated this week
- A task benchmark ☆44 Updated last year
- ytopt: machine-learning-based autotuning and hyperparameter optimization framework using Bayesian Optimization ☆49 Updated this week
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆65 Updated last month
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆67 Updated 2 weeks ago
- ☆49 Updated 5 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆114 Updated last week
- Information about many aspects of high-performance computing. Wiki content moved to ~/docs. ☆312 Updated last month
- 🎃 GPU load-balancing library for regular and irregular computations. ☆66 Updated 4 months ago
- RAJA Performance Portability Layer (C++) ☆561 Updated this week
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆46 Updated 2 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆260 Updated last year