kennethdsheridan / rocm_gpu_tradecraft
Commands that will make you more comfortable with the ROCm toolkit.
☆17 Updated 10 months ago
Alternatives and similar repositories for rocm_gpu_tradecraft
Users who are interested in rocm_gpu_tradecraft are comparing it to the libraries listed below.
- Test suite for probing the numerical behavior of NVIDIA tensor cores ☆38 Updated 10 months ago
- rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform. ☆86 Updated last week
- ☆24 Updated last month
- RCCL Performance Benchmark Tests ☆67 Updated 2 weeks ago
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆95 Updated 2 weeks ago
- ☆46 Updated this week
- A CUTLASS implementation using SYCL ☆23 Updated last week
- NVIDIA HPCG is based on the HPCG benchmark and optimized for performance on NVIDIA accelerated HPC systems. ☆56 Updated last month
- Advanced Profiling and Analytics for AMD Hardware ☆156 Updated this week
- Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) device. Note… ☆61 Updated 3 months ago
- An extension library of WMMA API (Tensor Core API) ☆97 Updated 10 months ago
- Extensible collectives library in Triton ☆87 Updated 2 months ago
- QuickReduce is a performant all-reduce library designed for AMD ROCm that supports inline compression. ☆27 Updated 2 months ago
- Bandwidth test for ROCm ☆56 Updated 2 weeks ago
- ROC profiler library. Profiling with perf-counters and derived metrics. ☆147 Updated last week
- ☆146 Updated this week
- ROCm BLAS marshalling library ☆142 Updated this week
- Benchmark for measuring the performance of sparse and irregular memory access. ☆78 Updated last month
- ☆61 Updated 5 months ago
- End to End steps for adding custom ops in PyTorch. ☆23 Updated 4 years ago
- rocWMMA ☆114 Updated last week
- ☆36 Updated this week
- Dev repo for power measurement for the MLPerf™ benchmarks ☆22 Updated last month
- Development repository for the Triton language and compiler ☆122 Updated this week
- AMD HPC Research Fund Cloud ☆13 Updated 3 weeks ago
- ☆25 Updated this week
- oneAPI Technical Advisory Board (TAB) Meeting Notes ☆72 Updated last year
- ROCm Parallel Primitives ☆172 Updated this week
- ☆18 Updated last year
- AI Tensor Engine for ROCm ☆201 Updated this week