Exploring the scalable matrix extension of the Apple M4 processor
☆231 Nov 7, 2024 Updated last year
Alternatives and similar repositories for m4-sme-exploration
Users who are interested in m4-sme-exploration are comparing it to the libraries listed below.
- Apple AMX Instruction Set ☆1,213 Dec 26, 2024 Updated last year
- CPU micro benchmarks ☆80 Apr 14, 2026 Updated 2 weeks ago
- ☆10 Apr 24, 2023 Updated 3 years ago
- Utility to sign DXIL code after compilation ☆22 Feb 18, 2019 Updated 7 years ago
- Apple GPU microarchitecture ☆596 Sep 22, 2024 Updated last year
- ☆54 Oct 31, 2021 Updated 4 years ago
- This repository is a read-only mirror of https://gitlab.arm.com/kleidi/kleidiai ☆137 Apr 22, 2026 Updated last week
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,459 Mar 12, 2026 Updated last month
- Investigation into replacing the MES compiler ☆30 Updated this week
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆256 Jul 13, 2023 Updated 2 years ago
- FP64 equivalent GEMM by the Ozaki scheme with Int8 Tensor Cores ☆120 Dec 2, 2025 Updated 4 months ago
- ☆82 Oct 29, 2024 Updated last year
- An easy-to-use and fast library for task-based parallelism, utilizing coroutines. ☆333 Sep 13, 2024 Updated last year
- ☆313 Updated this week
- Cuda-Based Software Rasterization for Billions of Triangles ☆131 Updated this week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆949 Mar 18, 2026 Updated last month
- IREE's PyTorch Frontend, based on Torch Dynamo. ☆109 Apr 21, 2026 Updated last week
- Interactive GUI Snowfall Simulation Created in C & Raylib ☆23 Dec 24, 2025 Updated 4 months ago
- Nvidia Instruction Set Specification Generator ☆321 Jul 9, 2024 Updated last year
- A Throughput-Optimized Pipeline Parallel Inference System for Large Language Models ☆49 Dec 24, 2025 Updated 4 months ago
- Reproducibility package for "Robust Join Processing with Diamond Hardened Joins" ☆12 Jul 10, 2024 Updated last year
- The University of Bristol HPC Simulation Engine ☆109 Aug 30, 2025 Updated 7 months ago
- Trying to figure various CPU things out ☆167 Jan 31, 2026 Updated 3 months ago
- ☆13 Jan 3, 2026 Updated 3 months ago
- A house for your mouse so you always know where it's at ☆21 Dec 2, 2025 Updated 4 months ago
- x86-64, ARM, and RVV intrinsics viewer ☆78 Feb 15, 2026 Updated 2 months ago
- Interpreter and compiler for the ISA specification language "Architecture Specification Language" (ASL) ☆28 Updated this week
- play gemm with tvm ☆91 Jul 22, 2023 Updated 2 years ago
- ☆12 Dec 1, 2023 Updated 2 years ago
- Pass Rust strings to C with potentially not needing heap allocation ☆12 Jan 25, 2026 Updated 3 months ago
- ☆16 Dec 11, 2024 Updated last year
- Binary Ninja Plugin for RISC-V ☆15 Nov 29, 2023 Updated 2 years ago
- An LLVM IR dataset for data-driven compiler optimization research ☆79 Mar 17, 2026 Updated last month
- ☆15 Dec 4, 2024 Updated last year
- ARMv8 SHA256 instructions implementation in C ☆30 Apr 28, 2016 Updated 10 years ago
- A utility for building Vulkan API layer drivers, and some off-the-shelf layers for the Arm Immortalis and Arm Mali GPUs. ☆39 Apr 21, 2026 Updated last week
- Turn your Apple Watch into an ammeter to measure DC currents ☆195 Sep 15, 2024 Updated last year
- An introduction to ARM64 assembly on Apple Silicon Macs ☆4,949 Apr 6, 2026 Updated 3 weeks ago
- Playing with the Metal Performance Shaders matrix multiplication kernel ☆27 Feb 22, 2017 Updated 9 years ago