parallelcodefoundry / ParEval
Parallel Code Evaluation Benchmark
☆38 · Updated this week
Alternatives and similar repositories for ParEval
Users who are interested in ParEval are comparing it to the repositories listed below.
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆123 · Updated this week
- A hierarchical collective communications library with portable optimizations ☆36 · Updated 11 months ago
- Scripts for fine-tuning an HPC Code LLM ☆14 · Updated last year
- Reference implementations of MLPerf™ HPC training benchmarks ☆49 · Updated 8 months ago
- This repo contains the dataset for paper: Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code ☆15 · Updated last year
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆314 · Updated last week
- [DEPRECATED] Moved to ROCm/rocm-systems repo ☆165 · Updated this week
- GPU Performance Advisor ☆65 · Updated 3 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block executions frequencies of compute kernels ☆32 · Updated 4 years ago
- ☆10 · Updated 7 months ago
- JUPITER Benchmark Suite ☆21 · Updated 3 months ago
- ytopt: machine-learning-based autotuning and hyperparameter optimization framework using Bayesian Optimization ☆50 · Updated last week
- ☆16 · Updated 7 months ago
- The Hardware Sampling (hws) library can be used to track hardware performance like clock frequency, memory usage, temperatures, or power … ☆18 · Updated 6 months ago
- Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) ☆112 · Updated 2 years ago
- Prototype of OpenSHMEM for NVIDIA GPUs, developed as part of DoE Design Forward ☆25 · Updated 7 years ago
- A light-weight MPI profiler. ☆102 · Updated last month
- Using C++ magic to capture CUDA kernels and tune them with Kernel Tuner ☆21 · Updated last month
- PaRSEC is a generic framework for architecture aware scheduling and management of micro-tasks on distributed, GPU accelerated, many-core … ☆72 · Updated 2 weeks ago
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆63 · Updated last week
- XSBench: The Monte Carlo Macroscopic Cross Section Lookup Benchmark ☆84 · Updated last year
- This is the public repo for the MLPerf DeepCAM climate data segmentation proposal. ☆16 · Updated last month
- Multi-GPU communication profiler and visualizer ☆36 · Updated last year
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆66 · Updated 7 years ago
- A Micro-benchmarking Tool for HPC Networks ☆32 · Updated 2 months ago
- GVProf: A Value Profiler for GPU-based Clusters ☆52 · Updated last year
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆25 · Updated 2 years ago
- RCCL Performance Benchmark Tests ☆78 · Updated last week
- SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs ☆49 · Updated this week
- CUDA GPU Benchmark ☆35 · Updated 9 months ago