ROCm / rocmProfileData
☆19 Updated last week
Alternatives and similar repositories for rocmProfileData:
Users who are interested in rocmProfileData are comparing it to the repositories listed below:
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆76 Updated last year
- ☆60 Updated 2 months ago
- RCCL Performance Benchmark Tests ☆59 Updated last month
- rocWMMA ☆101 Updated this week
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆56 Updated last week
- ☆19 Updated 3 months ago
- Next generation SPARSE implementation for ROCm platform ☆118 Updated this week
- ROCm Tracer Callback/Activity Library for Performance tracing AMD GPUs ☆79 Updated last week
- LLVM/MLIR based compiler instrumentation of AMD GPU kernels ☆17 Updated last week
- Advanced Profiling and Analytics for AMD Hardware ☆140 Updated this week
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆135 Updated last week
- ☆88 Updated 10 months ago
- An extension library of WMMA API (Tensor Core API) ☆90 Updated 7 months ago
- amdgpu example code in hip/asm ☆28 Updated 2 weeks ago
- HPCG benchmark based on ROCm platform ☆37 Updated this week
- ☆137 Updated this week
- ROCm BLAS marshalling library ☆132 Updated this week
- OpenAI Triton backend for Intel® GPUs ☆165 Updated this week
- ☆17 Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 Updated 7 years ago
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆37 Updated 7 months ago
- ☆20 Updated last year
- Ahead of Time (AOT) Triton Math Library ☆54 Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆67 Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆233 Updated this week
- ROCm SPARSE marshalling library ☆67 Updated this week
- ☆34 Updated this week
- ☆20 Updated this week
- High Performance Linpack for Next-Generation AMD HPC Accelerators ☆46 Updated 3 weeks ago