north-numerical-computing / tensor-cores-numerical-behavior
Test suite for probing the numerical behavior of NVIDIA tensor cores
☆36 · Updated 5 months ago
Alternatives and similar repositories for tensor-cores-numerical-behavior:
Users who are interested in tensor-cores-numerical-behavior are comparing it to the libraries listed below.
- ☆81 · Updated 8 months ago
- An extension library of WMMA API (Tensor Core API) ☆87 · Updated 6 months ago
- Dissecting NVIDIA GPU Architecture ☆82 · Updated 2 years ago
- ☆66 · Updated 3 weeks ago
- ☆46 · Updated 5 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆122 · Updated 4 years ago
- ☆16 · Updated 5 years ago
- A Winograd Minimal Filter Implementation in CUDA ☆23 · Updated 3 years ago
- ☆40 · Updated 4 years ago
- ☆38 · Updated 4 years ago
- rocWMMA ☆97 · Updated this week
- Code for paper "Design Principles for Sparse Matrix Multiplication on the GPU" accepted to Euro-Par 2018 ☆72 · Updated 4 years ago
- ☆131 · Updated this week
- ☆15 · Updated this week
- GPU Performance Advisor ☆63 · Updated 2 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆103 · Updated last month
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators ☆71 · Updated last year
- Automatic Mapping Generation, Verification, and Exploration for ISA-based Spatial Accelerators ☆106 · Updated 2 years ago
- ☆38 · Updated 3 years ago
- SparseTIR: Sparse Tensor Compiler for Deep Learning ☆133 · Updated last year
- PyTorch emulation library for Microscaling (MX)-compatible data formats ☆187 · Updated 3 months ago
- Fast GPU based tensor core reductions ☆13 · Updated 2 years ago
- OpenAI Triton backend for Intel® GPUs ☆154 · Updated this week
- CUDA Matrix Multiplication Optimization ☆152 · Updated 5 months ago
- GEMM and Winograd based convolutions using CUTLASS ☆26 · Updated 4 years ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores ☆47 · Updated last year
- Several optimization methods of half-precision general matrix vector multiplication (HGEMV) using CUDA core. ☆54 · Updated 4 months ago
- collection of benchmarks to measure basic GPU capabilities ☆280 · Updated 2 weeks ago
- ☆32 · Updated 2 years ago
- ☆178 · Updated 6 months ago