amd / fuzzyHSA
☆52Updated last year
Alternatives and similar repositories for fuzzyHSA
Users who are interested in fuzzyHSA are comparing it to the repositories listed below
- ☆63Updated last year
- ☆448Updated 7 months ago
- Super fast FP32 matrix multiplication on RDNA3☆79Updated 7 months ago
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.☆305Updated this week
- ☆22Updated 3 weeks ago
- Repository of model demos using TT-Buda☆63Updated 7 months ago
- ☆162Updated 2 weeks ago
- Fast and Furious AMD Kernels☆110Updated this week
- Schola is a plugin for enabling Reinforcement Learning (RL) in Unreal Engine. It provides tools to help developers create environments, d…☆59Updated last month
- RDNA3 emulator☆54Updated 7 months ago
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take…☆87Updated this week
- asynchronous/distributed speculative evaluation for llama3☆38Updated last year
- High-Performance SGEMM on CUDA devices☆110Updated 9 months ago
- Custom PTX Instruction Benchmark☆134Updated 8 months ago
- LLM training in simple, raw C/HIP for AMD GPUs☆54Updated last year
- OpenCL/SPIR-V implementation of HIP☆105Updated 3 years ago
- Samples of good AI generated CUDA kernels☆91Updated 5 months ago
- ☆45Updated last month
- Tensor Tiling Library☆38Updated last month
- ☆153Updated this week
- Make PyTorch models at least run on APUs.☆57Updated last year
- ☆48Updated 2 years ago
- ctypes wrappers for HIP, CUDA, and OpenCL☆130Updated last year
- Gpu benchmark☆72Updated 9 months ago
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …☆52Updated 8 months ago
- Nvidia Instruction Set Specification Generator☆298Updated last year
- A collection of examples for the ROCm software stack☆253Updated this week
- llama.cpp fork used by GPT4All☆56Updated 8 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo☆114Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.☆73Updated 9 months ago