amd / fuzzyHSA
☆54 Updated 10 months ago
Alternatives and similar repositories for fuzzyHSA:
Users who are interested in fuzzyHSA are comparing it to the libraries listed below:
- ☆58 Updated 10 months ago
- chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs. ☆269 Updated this week
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆62 Updated this week
- Schola is a plugin for enabling Reinforcement Learning (RL) in Unreal Engine. It provides tools to help developers create environments, d… ☆37 Updated last month
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm ☆66 Updated this week
- Super fast FP32 matrix multiplication on RDNA3 ☆51 Updated last month
- RDNA3 emulator ☆54 Updated 3 weeks ago
- ☆116 Updated 2 weeks ago
- High-Performance SGEMM on CUDA devices ☆90 Updated 3 months ago
- ☆92 Updated last week
- Tensor Tiling Library ☆36 Updated 3 weeks ago
- Derived from Nemes' gpuperftests ☆30 Updated 9 months ago
- ☆131 Updated this week
- rocWMMA ☆110 Updated this week
- LLM training in simple, raw C/HIP for AMD GPUs ☆48 Updated 7 months ago
- asynchronous/distributed speculative evaluation for llama3 ☆39 Updated 9 months ago
- OpenCL/SPIR-V implementation of HIP ☆104 Updated 2 years ago
- ☆444 Updated last month
- Custom PTX Instruction Benchmark ☆123 Updated 2 months ago
- AI Tensor Engine for ROCm ☆190 Updated this week
- An implementation of HIP that works on CPUs, across OSes. ☆116 Updated last year
- ☆19 Updated this week
- ☆226 Updated 2 weeks ago
- hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditiona… ☆94 Updated this week
- llama.cpp fork used by GPT4All ☆55 Updated 2 months ago
- User-Mode Driver for Tenstorrent hardware ☆20 Updated this week
- Bandwidth test for ROCm ☆54 Updated this week
- ROCm BLAS marshalling library ☆140 Updated this week
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆47 Updated this week
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆126 Updated 9 months ago