philipturner / metal-benchmarks
Apple GPU microarchitecture
☆493 Updated 4 months ago
Alternatives and similar repositories for metal-benchmarks:
Users who are interested in metal-benchmarks compare it to the repositories listed below.
- Apple G13 GPU architecture docs and tools ☆571 Updated 8 months ago
- Exploring the scalable matrix extension of the Apple M4 processor ☆162 Updated 2 months ago
- Apple AMX Instruction Set ☆1,031 Updated last month
- Nvidia Instruction Set Specification Generator ☆236 Updated 6 months ago
- Print all known information about the GPU on Apple-designed chips ☆72 Updated 5 months ago
- Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices ☆130 Updated 2 years ago
- FlashAttention (Metal Port) ☆430 Updated 4 months ago
- Apple Firestorm/Icestorm CPU microarchitecture docs ☆231 Updated last year
- ☆421 Updated last month
- A profiler to disclose and quantify hardware features on GPUs. ☆165 Updated 2 years ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆128 Updated 7 months ago
- CUDA/Metal accelerated language model inference ☆494 Updated last month
- Emulating double-precision arithmetic on Apple GPUs ☆48 Updated last year
- GPUOcelot: A dynamic compilation framework for PTX ☆161 Updated last month
- Extract Metal functions from .metallib files. ☆138 Updated last year
- Reverse engineered Linux driver for the Apple Neural Engine (ANE). ☆385 Updated 10 months ago
- Solve Puzzles. Learn Metal 🤘 ☆503 Updated 4 months ago
- Drawing graphics efficiently on Apple Vision using the Metal rendering API ☆260 Updated 2 months ago
- A micro Vulkan compute pipeline and a collection of benchmarking compute shaders ☆230 Updated 5 months ago
- Metal-cpp is a low-overhead C++ interface for Metal that helps developers add Metal functionality to graphics apps, games, and game engin… ☆289 Updated last month
- Sample benchmark demonstrating the VK_KHR_cooperative_matrix extension ☆69 Updated 3 weeks ago
- Library to manipulate Apple Metal Shading Language IR ☆49 Updated 2 years ago
- ☆260 Updated last month
- "Learn Metal with C++" samples, ported to iOS ☆154 Updated last year
- C API for MLX ☆91 Updated this week
- Visualization of cache-optimized matrix multiplication ☆102 Updated 5 years ago
- Native WebGPU implementation. Mirror of https://dawn.googlesource.com/dawn ☆494 Updated this week
- Everything we actually know about the Apple Neural Engine (ANE) ☆2,139 Updated 4 months ago
- Graphics Processing Unit (GPU) Architecture Guide ☆180 Updated 2 years ago
- throwaway GPT inference ☆140 Updated 7 months ago