microsoft / ArchProbe
A profiler to disclose and quantify hardware features on GPUs.
☆169 Updated 3 years ago
Alternatives and similar repositories for ArchProbe
Users who are interested in ArchProbe are comparing it to the libraries listed below.
- A micro Vulkan compute pipeline and a collection of benchmarking compute shaders ☆239 Updated 2 months ago
- ☆146 Updated this week
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆131 Updated last year
- rocWMMA ☆114 Updated this week
- An extension library of WMMA API (Tensor Core API) ☆97 Updated 10 months ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 Updated 7 years ago
- ☆61 Updated 5 months ago
- Assembler for NVIDIA Volta and Turing GPUs ☆218 Updated 3 years ago
- Training material for Nsight developer tools ☆157 Updated 9 months ago
- Stretching GPU performance for GEMMs and tensor contractions. ☆242 Updated last week
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆95 Updated 2 weeks ago
- ☆96 Updated last year
- AMD's graph optimization engine. ☆220 Updated this week
- Intercept Layer for Debugging and Analyzing OpenCL Applications ☆328 Updated last week
- ☆151 Updated this week
- Advanced Profiling and Analytics for AMD Hardware ☆156 Updated this week
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- ☆44 Updated 4 years ago
- CUDA Matrix Multiplication Optimization ☆188 Updated 10 months ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆502 Updated 2 years ago
- Conversion to/from half-precision floating point formats ☆354 Updated 10 months ago
- ☆249 Updated this week
- ☆311 Updated 5 months ago
- TPP experimentation on MLIR for linear algebra ☆131 Updated this week
- Shared Middle-Layer for Triton Compilation ☆251 Updated this week
- Dissecting NVIDIA GPU Architecture ☆95 Updated 2 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆135 Updated this week
- amdgpu example code in hip/asm ☆32 Updated 2 weeks ago
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆329 Updated this week
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year