Apple AMX Instruction Set
☆1,196 · Dec 26, 2024 · Updated last year
Alternatives and similar repositories for amx
Users who are interested in amx are comparing it to the repositories listed below
- Apple G13 GPU architecture docs and tools · ☆644 · May 16, 2025 · Updated 9 months ago
- Rust wrapper for Apple Matrix Coprocessor (AMX) instructions · ☆53 · Nov 14, 2023 · Updated 2 years ago
- Apple GPU microarchitecture · ☆578 · Sep 22, 2024 · Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor · ☆222 · Nov 7, 2024 · Updated last year
- Apple Firestorm/Icestorm CPU microarchitecture docs · ☆251 · Jul 13, 2023 · Updated 2 years ago
- Everything we actually know about the Apple Neural Engine (ANE) · ☆2,364 · Oct 21, 2025 · Updated 4 months ago
- An introduction to ARM64 assembly on Apple Silicon Macs · ☆4,904 · Nov 19, 2025 · Updated 3 months ago
- Running linear algebra as fast as possible on Apple silicon · ☆28 · Aug 18, 2023 · Updated 2 years ago
- ☆33 · Mar 31, 2025 · Updated 11 months ago
- Performance-portable, length-agnostic SIMD with runtime dispatch · ☆5,346 · Updated this week
- tiniest x86-64-linux emulator · ☆7,449 · Dec 10, 2025 · Updated 2 months ago
- mold: A Modern Linker 🦠 · ☆16,209 · Updated this week
- Kernel extension that enables TSO for Apple silicon processes · ☆265 · Jun 18, 2023 · Updated 2 years ago
- Implementations of SIMD instruction sets for systems which don't natively support them. · ☆2,967 · Feb 23, 2026 · Updated last week
- BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support. · ☆36 · Jan 7, 2023 · Updated 3 years ago
- A bootloader and experimentation playground for Apple Silicon · ☆3,999 · Feb 24, 2026 · Updated last week
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). · ☆459 · Mar 12, 2024 · Updated last year
- ☆313 · Sep 25, 2025 · Updated 5 months ago
- RSD: RISC-V Out-of-Order Superscalar Processor · ☆1,152 · Feb 21, 2026 · Updated last week
- Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE) · ☆2,671 · Apr 25, 2023 · Updated 2 years ago
- Dissecting the M1's GPU for 3D acceleration · ☆1,020 · Apr 4, 2022 · Updated 3 years ago
- A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation · ☆1,490 · Feb 16, 2026 · Updated 2 weeks ago
- Tool for messing around with Apple GPU assembly · ☆27 · Jan 24, 2021 · Updated 5 years ago
- ☆1,506 · Jul 22, 2022 · Updated 3 years ago
- A collection of reverse engineered Apple things, as well as a machine-readable database of Apple hardware · ☆1,286 · Jan 10, 2026 · Updated last month
- You like pytorch? You like micrograd? You love tinygrad! ❤️ · ☆31,471 · Updated this week
- FlashAttention (Metal Port) · ☆589 · Sep 22, 2024 · Updated last year
- A superoptimizer for LLVM IR · ☆2,350 · Aug 28, 2024 · Updated last year
- Reverse engineering Rosetta 2 on M1 Mac · ☆426 · Aug 3, 2021 · Updated 4 years ago
- Circuit IR Compilers and Tools · ☆2,044 · Feb 26, 2026 · Updated last week
- Decompiling macOS Hypervisor.framework by hand · ☆134 · Sep 13, 2022 · Updated 3 years ago
- iPhone 11 emulated on QEMU · ☆2,189 · Oct 22, 2022 · Updated 3 years ago
- Measures the latency between CPU cores · ☆1,327 · Aug 13, 2024 · Updated last year
- ☆33 · Feb 9, 2026 · Updated 3 weeks ago
- mimalloc is a compact general purpose allocator with excellent performance. · ☆12,547 · Feb 6, 2026 · Updated 3 weeks ago
- MLX: An array framework for Apple silicon · ☆24,066 · Feb 26, 2026 · Updated last week
- Basic SAT model of x86 instructions using Z3, autogenerated from Intel docs · ☆321 · Dec 1, 2021 · Updated 4 years ago
- Extract Metal functions from .metallib files. · ☆177 · May 24, 2023 · Updated 2 years ago
- LZBITMAP compression library · ☆54 · Jan 18, 2023 · Updated 3 years ago