ShoYamanishi / AppleNumericalComputing
Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices
☆141 Updated 2 years ago
Alternatives and similar repositories for AppleNumericalComputing
Users that are interested in AppleNumericalComputing are comparing it to the libraries listed below
- FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme ☆70 Updated 2 months ago
- Metal Shading Language on Apple M1's GPU for scientific C++. ☆93 Updated last year
- Running linear algebra as fast as possible on Apple silicon ☆20 Updated last year
- Scientific computing with Metal in C++: Matrix multiplication example ☆29 Updated 2 years ago
- BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support. ☆35 Updated 2 years ago
- Emulating double-precision arithmetic on Apple GPUs ☆52 Updated 2 years ago
- Generate simple index ranges in C++ and CUDA C++ ☆39 Updated last year
- BGHT: High-performance static GPU hash tables. ☆65 Updated last month
- A python library to run metal compute kernels on macOS ☆78 Updated 4 months ago
- μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updatin… ☆176 Updated this week
- ☆66 Updated 2 years ago
- Template for starting CUDA/C++ project using CMake with Github Action for CI ☆29 Updated 2 years ago
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆52 Updated 2 months ago
- Exploring the scalable matrix extension of the Apple M4 processor ☆176 Updated 6 months ago
- Fast and full-featured Matrix Market I/O library for C++, Python, and R ☆79 Updated 10 months ago
- Apple GPU microarchitecture ☆522 Updated 8 months ago
- Benchmarking OpenBLAS on the Apple M1 ☆18 Updated 4 years ago
- A demo of how to write a high-performance convolution that runs on Apple silicon ☆54 Updated 3 years ago
- Metal-cpp is a low-overhead C++ interface for Metal that helps developers add Metal functionality to graphics apps, games, and game engin… ☆307 Updated 5 months ago
- ☆58 Updated 9 months ago
- A profiler to disclose and quantify hardware features on GPUs. ☆169 Updated 3 years ago
- ☆50 Updated last year
- QuanTaichi evaluation suite ☆158 Updated last year
- GPUOcelot: A dynamic compilation framework for PTX ☆192 Updated 3 months ago
- MatMul Performance Benchmarks for a Single CPU Core comparing both hand engineered and codegen kernels. ☆131 Updated last year
- A demo illustrating how to use Taichi as an AOT shader compiler ☆73 Updated last month
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- Software library for FDTD of viscoelastic equation using a staggered grid arrangement with support for GPU and CPU backends ☆56 Updated 2 months ago
- An experimental repo for accessing Metal API from Python (OSX Only) ☆23 Updated 4 years ago
- A Library for fast Hash Tables on GPUs ☆119 Updated 2 years ago