icl-utk-edu / papi
☆130 Updated last week
Alternatives and similar repositories for papi:
Users who are interested in papi compare it to the libraries listed below.
- Advanced Profiling and Analytics for AMD Hardware ☆140 Updated this week
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆130 Updated this week
- A collection of performance analysis tools, recipes, handy scripts, microbenchmarks & more ☆129 Updated this week
- SYCL Benchmark Suite ☆61 Updated last week
- ☆233 Updated last week
- Benchmark for measuring the performance of sparse and irregular memory access. ☆76 Updated last week
- GPUDirect Async support for IB Verbs ☆100 Updated 2 years ago
- Unified Collective Communication Library ☆227 Updated this week
- Rodinia benchmark ☆170 Updated last year
- A light-weight MPI profiler. ☆87 Updated 6 months ago
- ☆42 Updated 4 years ago
- Provides a set of benchmarks that can be used to measure the memory bandwidth performance of CPUs ☆84 Updated 10 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆50 Updated this week
- Forked from https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/ ☆19 Updated 5 years ago
- The University of Bristol HPC Simulation Engine ☆95 Updated this week
- Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite ☆64 Updated 6 years ago
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆46 Updated 2 weeks ago
- oneAPI Collective Communications Library (oneCCL) ☆222 Updated 3 weeks ago
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆105 Updated this week
- ☆228 Updated last week
- SST Structural Simulation Toolkit Parallel Discrete Event Core and Services ☆137 Updated this week
- ☆59 Updated 4 months ago
- TPP experimentation on MLIR for linear algebra ☆119 Updated this week
- Unofficial description of the CUDA assembly (SASS) instruction sets. ☆59 Updated 3 months ago
- Tools to create performance and roofline plots from measured data ☆58 Updated 10 years ago
- Vendor-neutral library for exposing power and performance features across diverse architectures ☆72 Updated 4 months ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures ☆50 Updated 3 weeks ago
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆135 Updated this week
- CUDA Flux is a profiler for GPU applications which reports the basic block execution frequencies of compute kernels ☆32 Updated 3 years ago