olcf-tutorials / local_mpi_to_gpu
How to use node-local MPI rank IDs to manually map MPI ranks to GPUs
☆14 Updated 5 years ago
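As a rough illustration of the idea in the description above, the sketch below maps a node-local rank to a GPU index in round-robin fashion. It is not taken from the local_mpi_to_gpu repository. It assumes the launcher exports the node-local rank through `OMPI_COMM_WORLD_LOCAL_RANK` (Open MPI) or `SLURM_LOCALID` (Slurm), and that the GPUs are NVIDIA devices selected through `CUDA_VISIBLE_DEVICES`. The function names and the `gpus_per_node` parameter are hypothetical.

```python
import os

# Assumption: only these two launcher variables are checked.
# Open MPI's mpirun sets the first, Slurm's srun sets the second.
LOCAL_RANK_VARS = ("OMPI_COMM_WORLD_LOCAL_RANK", "SLURM_LOCALID")


def local_rank_from_env(env=os.environ):
    """Return the node-local rank published by the launcher."""
    for name in LOCAL_RANK_VARS:
        if name in env:
            return int(env[name])
    raise RuntimeError("no node-local rank variable found in the environment")


def gpu_for_local_rank(local_rank, gpus_per_node):
    """Round-robin mapping: local ranks wrap around the GPUs on the node."""
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    return local_rank % gpus_per_node


def pin_process_to_gpu(gpus_per_node, env=os.environ):
    """Restrict this process to one GPU by setting CUDA_VISIBLE_DEVICES.

    This only takes effect if it runs before the CUDA runtime is initialized.
    """
    gpu = gpu_for_local_rank(local_rank_from_env(env), gpus_per_node)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu
```

With six GPUs per node, for example, local ranks 0 through 5 each get their own GPU and local rank 6 wraps back to GPU 0. A launcher-independent alternative is to derive the local rank inside the program by splitting `MPI_COMM_WORLD` with `MPI_Comm_split_type` and `MPI_COMM_TYPE_SHARED`.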
Alternatives and similar repositories for local_mpi_to_gpu:
Users who are interested in local_mpi_to_gpu are comparing it to the repositories listed below:
- Distributed View Extension for Kokkos ☆45 Updated 4 months ago
- OpenMP vs Offload ☆21 Updated last year
- Logger for MPI communication ☆26 Updated last year
- QMCPACK miniapp: a simplified real space QMC code for algorithm development, performance portability testing, and computer science experi… ☆27 Updated 9 months ago
- Molecular dynamics proxy application based on Kokkos ☆33 Updated 9 months ago
- MiniMD Molecular Dynamics Mini-App ☆50 Updated last month
- This aims to be a wrapper to C-MPI3 for C++, using the principles of simplicity, STL, RAII and Boost and enforcing type-safety. This i… ☆21 Updated 6 months ago
- ☆12 Updated last year
- Parallel Computing -- Validation Suite: Validation engine for Exascale project benchmarks ☆14 Updated last year
- CPE change log and release notes ☆26 Updated 7 months ago
- A compression benchmark suite ☆17 Updated last year
- ALCF Computational Performance Workshop ☆37 Updated 2 years ago
- Comb is a communication performance benchmarking tool. ☆24 Updated 2 years ago
- Very-Low Overhead Checkpointing System ☆57 Updated 3 months ago
- Training examples for SYCL ☆40 Updated last week
- Benchmark implementation of CosmoFlow in TensorFlow Keras ☆21 Updated last year
- ☆30 Updated 11 months ago
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core … ☆56 Updated this week
- A source-to-source translator for OpenACC to OpenMP. ☆16 Updated 3 years ago
- A proxy app for the Monte Carlo Transport Code, Mercury. LLNL-CODE-684037 ☆40 Updated last year
- An open collaborative repository for reproducible specifications of HPC benchmarks and cross site benchmarking environments ☆38 Updated this week
- ☆10 Updated last month
- Tensor Algebra Library Routines for Shared Memory Systems ☆38 Updated last year
- A benchmark suite for measuring HDF5 performance. ☆40 Updated 8 months ago
- CPU and GPU tutorial examples ☆13 Updated 3 weeks ago
- DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems… ☆12 Updated 3 weeks ago
- ☆82 Updated this week
- A repository of codelabs and tutorials to support education in scientific computing ☆27 Updated last year
- This tutorial demonstrates how to use CUDA-Aware MPI ☆38 Updated last year
- The Task-Aware MPI (TAMPI) library extends the functionality of standard MPI libraries by providing new mechanisms for improving the inte… ☆24 Updated 5 months ago