corsix / amx
Apple AMX Instruction Set
☆1,086, updated 5 months ago
Alternatives and similar repositories for amx
Users who are interested in amx compare it to the repositories listed below.
- Apple G13 GPU architecture docs and tools ☆588, updated 2 weeks ago
- Exploring the scalable matrix extension of the Apple M4 processor ☆176, updated 6 months ago
- Apple GPU microarchitecture ☆522, updated 8 months ago
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆241, updated last year
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆412, updated last year
- Reverse engineering Rosetta 2 on M1 Mac ☆402, updated 3 years ago
- Nvidia Instruction Set Specification Generator ☆267, updated 10 months ago
- ☆283, updated 5 months ago
- Dissecting the M1's GPU for 3D acceleration ☆1,006, updated 3 years ago
- ☆443, updated last month
- Exocompilation for productive programming of hardware accelerators ☆607, updated this week
- ☆1,041, updated 2 weeks ago
- GPUOcelot: A dynamic compilation framework for PTX ☆192, updated 3 months ago
- C/C++ frontend for MLIR. Also features polyhedral optimizations, parallel optimizations, and more! ☆545, updated this week
- A new (MLIR based) high-level IR for clang. ☆499, updated this week
- ☆296, updated last year
- The fastest RISC-V sandbox ☆865, updated this week
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,212, updated 2 months ago
- MLIR For Beginners tutorial ☆985, updated 3 months ago
- An introduction to ARM64 assembly on Apple Silicon Macs ☆4,676, updated 2 months ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆502, updated 2 years ago
- HIPIFY: Convert CUDA to Portable C++ Code ☆580, updated this week
- TT-NN operator library, and TT-Metalium low level kernel programming model. ☆888, updated this week
- C++ template library for high performance SIMD based sorting algorithms ☆944, updated 2 weeks ago
- Automatic verification of LLVM optimizations ☆911, updated 2 weeks ago
- FlashAttention (Metal Port) ☆487, updated 8 months ago
- Sniff CUDA ioctls ☆192, updated 2 years ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆129, updated 11 months ago
- advanced compilers ☆833, updated this week
- The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem. ☆1,545, updated this week