Qazalin / remu
RDNA3 emulator
☆54 Updated 5 months ago
Alternatives and similar repositories for remu
Users who are interested in remu are comparing it to the repositories listed below.
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- Nvidia Instruction Set Specification Generator ☆295 Updated last year
- ☆73 Updated 2 weeks ago
- tiny code to access tenstorrent blackhole ☆59 Updated 4 months ago
- Tenstorrent system interface library ☆31 Updated last week
- FP4 MAC Array ☆19 Updated last year
- tenstorrent kernel from twitch ☆28 Updated last year
- Tenstorrent console based hardware information program ☆53 Updated last week
- Custom PTX Instruction Benchmark ☆129 Updated 7 months ago
- ☆448 Updated 6 months ago
- Tenstorrent MLIR compiler ☆187 Updated this week
- Repo for AI Compiler team. The intended purpose of this repo is for implementation of a PJRT device. ☆34 Updated this week
- Sniff CUDA ioctls ☆212 Updated 2 years ago
- Tenstorrent's MLIR Based Compiler. We aim to enable developers to run AI on all configurations of Tenstorrent hardware, through an open-s… ☆122 Updated this week
- Super fast FP32 matrix multiplication on RDNA3 ☆75 Updated 6 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆210 Updated 8 months ago
- Exocompilation for productive programming of hardware accelerators ☆672 Updated this week
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆214 Updated last year
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆115 Updated this week
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1. ☆184 Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor ☆206 Updated 11 months ago
- Run 64-bit Linux on LiteX + RocketChip ☆202 Updated 2 months ago
- ☆25 Updated this week
- The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings. ☆109 Updated last week
- Attention in SRAM on Tenstorrent Grayskull ☆38 Updated last year
- The Finite Field Assembly Programming Language ☆36 Updated 4 months ago
- ⭐️ TTNN Compiler for PyTorch 2 ⭐️ Enables running PyTorch models on Tenstorrent hardware using eager or compile path ☆57 Updated last week
- This project aims to enable language model inference on FPGAs, supporting AI applications in edge devices and environments with limited r… ☆167 Updated last year
- MLIR-based partitioning system ☆137 Updated this week
- High-Performance SGEMM on CUDA devices ☆107 Updated 8 months ago