icl-utk-edu / papi
☆114Updated this week
Related projects
Alternatives and complementary repositories for papi
- Benchmark for measuring the performance of sparse and irregular memory access.☆75Updated this week
- Advanced Profiling and Analytics for AMD Hardware☆137Updated this week
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite☆60Updated 6 years ago
- ☆41Updated 4 years ago
- The University of Bristol HPC Simulation Engine☆93Updated last week
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs☆81Updated 7 months ago
- Magnum IO community repo☆79Updated 5 months ago
- The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures☆46Updated last month
- SST Structural Simulation Toolkit Parallel Discrete Event Core and Services☆132Updated last week
- ROC profiler library. Profiling with perf-counters and derived metrics.☆132Updated this week
- A benchmarking suite for heterogeneous systems. The primary goal of this project is to improve and update aspects of existing benchmarkin…☆40Updated 8 months ago
- Unified Collective Communication Library☆207Updated last week
- Tools to create performance and roofline plots from measured data☆58Updated 10 years ago
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels☆31Updated 3 years ago
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/☆17Updated 5 years ago
- Measure instruction latency and throughput☆22Updated 2 years ago
- GPUDirect Async support for IB Verbs☆90Updated 2 years ago
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs.☆29Updated 2 months ago
- ☆224Updated 2 months ago
- Bandwidth test for ROCm☆49Updated this week
- High Performance Linpack for Next-Generation AMD HPC Accelerators☆43Updated last month
- Instantiate the Cache Aware Roofline Model on single-socket and multi-socket systems.☆27Updated 5 years ago
- ROC_SHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.☆41Updated last year
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more☆117Updated this week
- ☆218Updated last week
- ☆80Updated 7 months ago
- Loop Kernel Analysis and Performance Modeling Toolkit☆89Updated 2 months ago
- SYCL Benchmark Suite☆56Updated 2 months ago
- ☆13Updated 2 months ago
- This serves as a repository for reproducibility of the SC21 paper "In-Depth Analyses of Unified Virtual Memory System for GPU Accelerated…☆27Updated last year