noppoMan / python-metal-benchmark
An experimental repo for accessing Metal API from Python (OSX Only)
☆23 Updated 5 years ago
Alternatives and similar repositories for python-metal-benchmark
Users interested in python-metal-benchmark are comparing it to the libraries listed below.
- A python library to run metal compute kernels on macOS ☆85 Updated 10 months ago
- 3X speedup over Apple’s TensorFlow plugin by using Apache TVM on M1 ☆138 Updated 3 years ago
- ☆54 Updated 4 years ago
- Implementation of Karpathy's micrograd in Mojo ☆78 Updated 2 years ago
- Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices ☆149 Updated 2 years ago
- FlashAttention (Metal Port) ☆557 Updated last year
- C API for MLX ☆153 Updated last week
- ☆54 Updated last year
- Print all known information about the GPU on Apple-designed chips ☆93 Updated last month
- Renderer for molecular nanotechnology ☆84 Updated this week
- Emulating double-precision arithmetic on Apple GPUs ☆55 Updated 2 years ago
- Apple GPU microarchitecture ☆562 Updated last year
- Metal Shading Language on Apple M1's GPU for scientific C++. ☆104 Updated 2 years ago
- Machine Learning library for the emerging Mojo/Python ecosystem ☆296 Updated last week
- Nod.ai 🦈 version of 👻 . You probably want to start at https://github.com/nod-ai/shark for the product and the upstream IREE repository … ☆107 Updated this week
- Convert StableHLO models into Apple Core ML format ☆19 Updated last week
- Benchmarking OpenBLAS on the Apple M1 ☆18 Updated 4 years ago
- A Learning Journey: Micrograd in Mojo 🔥 ☆63 Updated last year
- Swift for NNC ☆77 Updated this week
- Running linear algebra as fast as possible on Apple silicon ☆27 Updated 2 years ago
- Training MLP on MNIST in 1.5 seconds with pure CUDA ☆46 Updated last year
- ☆56 Updated 2 years ago
- 1D, 2D, and 3D variations of Fast Fourier Transforms ☆33 Updated 3 years ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- Exploring the scalable matrix extension of the Apple M4 processor ☆211 Updated last year
- The Foundation for All Legate Libraries ☆232 Updated last week
- High-Performance SGEMM on CUDA devices ☆112 Updated 10 months ago
- LLM training in simple, raw C/CUDA ☆108 Updated last year
- LLM training in simple, raw C/Metal Shading Language ☆60 Updated last year
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator ☆214 Updated last year