Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.
☆19 · Feb 9, 2026 · Updated 2 months ago
Alternatives and similar repositories for fp8-quant-matmul
Users that are interested in fp8-quant-matmul are comparing it to the libraries listed below.
- My submission for the GPUMODE/AMD fp8 mm challenge ☆29 · Jun 4, 2025 · Updated 10 months ago
- ☆12 · Dec 22, 2024 · Updated last year
- ☆23 · Apr 7, 2026 · Updated last week
- ☆18 · Jun 6, 2025 · Updated 10 months ago
- AMD-SHARK Inference Modeling and Serving ☆62 · Updated this week
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- General Matrix Multiplication using NVIDIA Tensor Cores ☆28 · Jan 25, 2025 · Updated last year
- ☆32 · Jul 2, 2025 · Updated 9 months ago
- A high-performance attention mechanism that computes softmax normalization in a single streaming pass using running accumulators (online …