corsix / amx
Apple AMX Instruction Set
☆1,161 · Updated 10 months ago
Alternatives and similar repositories for amx
Users who are interested in amx are comparing it to the repositories listed below.
- Apple G13 GPU architecture docs and tools ☆620 · Updated 5 months ago
- Apple GPU microarchitecture ☆556 · Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor ☆208 · Updated 11 months ago
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆430 · Updated last year
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆245 · Updated 2 years ago
- ☆302 · Updated last month
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,299 · Updated last week
- ☆448 · Updated 6 months ago
- Nvidia Instruction Set Specification Generator ☆297 · Updated last year
- Dissecting the M1's GPU for 3D acceleration ☆1,014 · Updated 3 years ago
- Sniff CUDA ioctls ☆215 · Updated 2 years ago
- ☆1,062 · Updated 5 months ago
- Exocompilation for productive programming of hardware accelerators ☆676 · Updated this week
- Kernel extension that enables TSO for Apple silicon processes ☆263 · Updated 2 years ago
- GPUOcelot: A dynamic compilation framework for PTX ☆210 · Updated 8 months ago
- Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM. ☆728 · Updated last month
- The fastest RISC-V sandbox ☆944 · Updated 2 weeks ago
- Measures the latency between CPU cores ☆1,274 · Updated last year
- ☆294 · Updated last year
- An introduction to ARM64 assembly on Apple Silicon Macs ☆4,814 · Updated 7 months ago
- BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support. ☆35 · Updated 2 years ago
- MLIR For Beginners tutorial ☆1,115 · Updated 3 months ago
- Solve Puzzles. Learn Metal 🤘 ☆588 · Updated last year
- C++ template library for high performance SIMD based sorting algorithms ☆979 · Updated last month
- TT-NN operator library, and TT-Metalium low level kernel programming model. ☆1,233 · Updated last week
- FlashAttention (Metal Port) ☆545 · Updated last year
- A new (MLIR based) high-level IR for clang. ☆544 · Updated last week
- nsync is a C library that exports various synchronization primitives, such as mutexes ☆1,220 · Updated last month
- Simple examples of Assembly code for the Apple Silicon (M1) CPU ☆82 · Updated 3 weeks ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :) ☆548 · Updated 2 years ago