reger-men / HPL_GPU

High-Performance Linpack Benchmark adapted for a GPU backend

☆11

Alternatives and similar repositories for HPL_GPU

Users who are interested in HPL_GPU compare it to the repositories listed below.

RIKEN-RCCS / hpl-ai
An HPL-AI implementation for Fugaku
☆21 Updated 4 years ago
north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆40 Updated last year
shixun404 / Fault-Tolerant-SGEMM-on-NVIDIA-GPUs
Anatomy of High-Performance GEMM with Online Fault Tolerance on GPUs
☆12 Updated 4 months ago
KernelTuner / kernel_launcher
Using C++ magic to launch/capture CUDA kernels and tune them with Kernel Tuner
☆21 Updated last year
ROCm / rocm_bandwidth_test
Bandwidth test for ROCm
☆63 Updated this week
ROCm / rccl-tests
RCCL Performance Benchmark Tests
☆71 Updated last week
ROCm / rocHPL
High Performance Linpack for Next-Generation AMD HPC Accelerators
☆60 Updated 3 weeks ago
ROCm / rocSHMEM
rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
☆98 Updated this week
ekondis / gpuroofperf-toolkit
A GPU performance prediction toolkit for CUDA programs
☆17 Updated 6 years ago
ROCm / pytorch-micro-benchmarking
☆20 Updated last week
PanZaifeng / RecFlex
A recommendation model kernel optimizing system
☆10 Updated 2 months ago
eth-cscs / Tiled-MM
Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.
☆32 Updated 4 months ago
wudu98 / autoGEMM
☆12 Updated 8 months ago
AMD-HPC / CoralGemm
☆13 Updated 2 months ago
mlcommons / hpc
Reference implementations of MLPerf™ HPC training benchmarks
☆48 Updated 5 months ago
cyanguwa / nersc-roofline
☆45 Updated 4 years ago
arm-hpc-devkit / nvidia-arm-hpc-devkit-users-guide
Get started with your NVIDIA Arm HPC Developers Kit!
☆33 Updated 2 years ago
ROCm / TransferBench
TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)
☆44 Updated this week
ROCm / aws-ofi-rccl
☆16 Updated 4 months ago
temporal-hpc / reduction-tensor-cores
Fast GPU based tensor core reductions
☆13 Updated 2 years ago
zjin-lcf / Rodinia_SYCL
☆14 Updated 4 years ago
wu-kan / HPL-AI
An implementation of HPL-AI Mixed-Precision Benchmark based on hpl-2.3
☆27 Updated 4 years ago
HPCToolkit / hpctoolkit-tutorial-examples
CPU and GPU tutorial examples
☆13 Updated 4 months ago
FZJ-JSC / jubench
JUPITER Benchmark Suite
☆19 Updated 3 weeks ago
ROCm / rocHPCG
HPCG benchmark based on ROCm platform
☆38 Updated last month
Jokeren / GPA
GPU Performance Advisor
☆65 Updated 3 years ago
AI-HPC-Research-Team / AIPerf
Automated machine learning as an AI-HPC benchmark
☆66 Updated 3 years ago
ROCm / roctracer
ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs
☆84 Updated 2 weeks ago
intel / cutlass-sycl
A CUTLASS implementation using SYCL
☆32 Updated this week
ROCm / hip-tests
☆37 Updated last week