corsix / amx
Apple AMX Instruction Set
☆1,168 Updated 10 months ago
Alternatives and similar repositories for amx
Users that are interested in amx are comparing it to the libraries listed below
- Apple G13 GPU architecture docs and tools ☆624 Updated 6 months ago
- Apple GPU microarchitecture ☆560 Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor ☆209 Updated last year
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆434 Updated last year
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆246 Updated 2 years ago
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,315 Updated 3 weeks ago
- ☆305 Updated last month
- Dissecting the M1's GPU for 3D acceleration ☆1,015 Updated 3 years ago
- ☆448 Updated 7 months ago
- Nvidia Instruction Set Specification Generator ☆298 Updated last year
- ☆1,066 Updated 6 months ago
- Exocompilation for productive programming of hardware accelerators ☆682 Updated 2 weeks ago
- MLIR For Beginners tutorial ☆1,137 Updated 4 months ago
- Sniff CUDA ioctls ☆216 Updated 2 years ago
- Solve Puzzles. Learn Metal 🤘 ☆591 Updated last year
- GPUOcelot: A dynamic compilation framework for PTX ☆212 Updated 9 months ago
- The fastest RISC-V sandbox ☆959 Updated last week
- Infrastructure for Machine Learning Guided Optimization (MLGO) in LLVM. ☆737 Updated this week
- Running linear algebra as fast as possible on Apple silicon ☆27 Updated 2 years ago
- nsync is a C library that exports various synchronization primitives, such as mutexes ☆1,234 Updated 2 weeks ago
- A new (MLIR based) high-level IR for clang. ☆547 Updated last week
- BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support. ☆36 Updated 2 years ago
- Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library ☆1,698 Updated 7 months ago
- Simple examples of Assembly code for the Apple Silicon (M1) CPU ☆85 Updated last month
- ☆78 Updated this week
- advanced compilers ☆878 Updated last week
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆367 Updated 6 months ago
- Implementations of SIMD instruction sets for systems which don't natively support them. ☆2,851 Updated 3 weeks ago
- ☆295 Updated last year
- Measures the latency between CPU cores ☆1,286 Updated last year