AMDGPU example code in HIP/asm
☆56Mar 2, 2026Updated this week
Alternatives and similar repositories for gcnasm
Users that are interested in gcnasm are comparing it to the libraries listed below
- A tool for generating information about the matrix multiplication instructions in AMD Radeon™ and AMD Instinct™ accelerators☆127Nov 14, 2025Updated 3 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror☆523Updated this week
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆17Feb 9, 2026Updated 3 weeks ago
- not infinite, but huge canvas collaborative vector drawing program☆13Oct 28, 2018Updated 7 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆145Feb 23, 2026Updated last week
- ☆18Mar 12, 2025Updated 11 months ago
- My submission for the GPUMODE/AMD fp8 mm challenge☆29Jun 4, 2025Updated 9 months ago
- A collection of examples for the ROCm software stack☆280Updated this week
- ☆30Updated this week
- Implement asm gemm on vega64 for 4096x4096 fp32 matrix☆22Oct 12, 2019Updated 6 years ago
- ☆21Mar 22, 2021Updated 4 years ago
- [DEPRECATED] Moved to ROCm/rocm-systems repo☆165Feb 16, 2026Updated 2 weeks ago
- Line segment rasterization with pixel-perfect clipping.☆23Jul 28, 2025Updated 7 months ago
- ☆31Feb 25, 2026Updated last week
- ☆132Aug 14, 2025Updated 6 months ago
- ☆116Updated this week
- ☆112Apr 19, 2024Updated last year
- RDMA programming study☆25Dec 6, 2021Updated 4 years ago
- Derived from Nemes' gpuperftests☆33Jul 11, 2024Updated last year
- The translator that supports translating NVPTX to SPIR-V. This translator is modified from LLVM-SPIR-V Translator.☆44Oct 25, 2021Updated 4 years ago
- Utilities for accessing AMD's Machine-Readable GPU ISA Specifications.☆46Sep 24, 2025Updated 5 months ago
- ☆53Feb 24, 2026Updated last week
- collection of benchmarks to measure basic GPU capabilities☆498Oct 24, 2025Updated 4 months ago
- HealthiVert-GAN, a novel deep-learning framework designed to generate pseudo-healthy vertebral images. These images simulate the pre-frac…☆11Nov 3, 2025Updated 4 months ago
- Use yolov5 to realize the road occupation operation and vehicle parking violation detection in urban streets, and can independently delin…☆12Jan 2, 2023Updated 3 years ago
- All Resources from Stanford CS106B 2021☆24Jul 11, 2025Updated 7 months ago
- A robust, open-source physical layer implementation for FPGA-to-FPGA communication over high-speed serial links of the Quantum Error Corr…☆28Mar 2, 2026Updated last week
- ☆45May 4, 2025Updated 10 months ago
- Medium Access Control layer of 802.15.4☆13Nov 14, 2014Updated 11 years ago
- Community maintained hardware plugin for vLLM on AWS Neuron☆24Feb 26, 2026Updated last week
- Programmatically generated PCB libraries facilitating robust electronic product design.☆17Dec 15, 2025Updated 2 months ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 5 months ago
- Open-source library for Graph Streaming. Solves the connected components problem using sub-linear space. Published in SIGMOD'22.☆10Nov 13, 2025Updated 3 months ago
- Fast and Furious AMD Kernels☆372Feb 26, 2026Updated last week
- ☆48Dec 11, 2020Updated 5 years ago
- A stripped-down flash-attention implementation using cutlass, intended for teaching☆58Aug 12, 2024Updated last year
- AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming☆179Updated this week
- ☆11May 20, 2022Updated 3 years ago
- ☆11Nov 14, 2023Updated 2 years ago