corsix / amx
Apple AMX Instruction Set
☆1,185 Updated last year
Alternatives and similar repositories for amx
Users who are interested in amx are comparing it to the repositories listed below
- Apple G13 GPU architecture docs and tools ☆636 Updated 8 months ago
- Apple GPU microarchitecture ☆572 Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor ☆216 Updated last year
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆250 Updated 2 years ago
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆449 Updated last year
- Dissecting the M1's GPU for 3D acceleration ☆1,017 Updated 3 years ago
- ☆310 Updated 3 months ago
- Nvidia Instruction Set Specification Generator ☆310 Updated last year
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,350 Updated 3 months ago
- ☆451 Updated 9 months ago
- Exocompilation for productive programming of hardware accelerators ☆699 Updated this week
- Sniff CUDA ioctls ☆222 Updated 2 years ago
- ☆1,074 Updated 8 months ago
- GPUOcelot: A dynamic compilation framework for PTX ☆219 Updated 11 months ago
- Circuit IR Compilers and Tools ☆1,998 Updated this week
- BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support. ☆36 Updated 3 years ago
- A new (MLIR based) high-level IR for clang. ☆580 Updated this week
- Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM. ☆746 Updated last week
- The fastest RISC-V sandbox ☆1,014 Updated 2 weeks ago
- MLIR For Beginners tutorial ☆1,199 Updated 6 months ago
- ☆296 Updated last year
- Measures the latency between CPU cores ☆1,312 Updated last year
- nsync is a C library that exports various synchronization primitives, such as mutexes ☆1,240 Updated 2 months ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- FlashAttention (Metal Port) ☆573 Updated last year
- It's a core. Made on Twitch. ☆266 Updated 4 years ago
- Solve Puzzles. Learn Metal 🤘 ☆597 Updated last year
- ☆87 Updated this week
- Optimized implementations of various library functions for ARM architecture processors ☆682 Updated last week
- Running linear algebra as fast as possible on Apple silicon ☆28 Updated 2 years ago