ShoYamanishi / AppleNumericalComputing

Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices

☆144

Alternatives and similar repositories for AppleNumericalComputing

Users who are interested in AppleNumericalComputing compare it to the libraries listed below.

larsgeb / m1-gpu-cpp
Metal Shading Language on Apple M1's GPU for scientific C++.
☆96 Updated last year
philipturner / metal-float64
Emulating double-precision arithmetic on Apple GPUs
☆55 Updated 2 years ago
philipturner / metal-benchmarks
Apple GPU microarchitecture
☆540 Updated 10 months ago
microsoft / ArchProbe
A profiler to disclose and quantify hardware features on GPUs.
☆173 Updated 3 years ago
bkvogel / metal_performance_testing
Scientific computing with Metal in C++: Matrix multiplication example
☆36 Updated 2 years ago
enp1s0 / ozIMMU
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
☆80 Updated 4 months ago
philipturner / amx-benchmarks
Running linear algebra as fast as possible on Apple silicon
☆21 Updated last year
MuGdxy / muda
μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updatin…
☆183 Updated last month
ptheywood / cuda-cmake-github-actions
☆59 Updated 11 months ago
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆55 Updated 4 months ago
baldand / py-metal-compute
A python library to run metal compute kernels on macOS
☆80 Updated 6 months ago
Ahdhn / CUDATemplate
Template for starting CUDA/C++ project using CMake with Github Action for CI
☆31 Updated last month
horizon-research / rtnn
☆67 Updated 2 years ago
owensgroup / BGHT
BGHT: High-performance static GPU hash tables.
☆70 Updated last month
NVIDIA / nsight-vscode-edition
A Visual Studio Code extension for building and debugging CUDA applications.
☆87 Updated 2 weeks ago
xrq-phys / blis_apple
BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support.
☆35 Updated 2 years ago
mark-poscablo / gpu-prefix-sum
CUDA implementation of exclusive prefix sum via Blelloch's algorithm
☆28 Updated 8 years ago
harrism / ranger
Generate simple index ranges in C++ and CUDA C++
☆39 Updated 2 years ago
PatWie / cuda-design-patterns
Some CUDA design patterns and a bit of template magic for CUDA
☆156 Updated 2 years ago
philipturner / applegpuinfo
Print all known information about the GPU on Apple-designed chips
☆86 Updated 11 months ago
NVlabs / cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
☆84 Updated last year
gpuocelot / gpuocelot
GPUOcelot: A dynamic compilation framework for PTX
☆207 Updated 6 months ago
taichi-dev / quantaichi
QuanTaichi evaluation suite
☆161 Updated last year
fynv / ThrustRTC
CUDA tool set for non-C++ languages that provides functionality similar to Thrust, with NVRTC at its core.
☆59 Updated 2 years ago
taichi-dev / taichi-aot-demo
A demo illustrating how to use Taichi as an AOT shader compiler
☆73 Updated 4 months ago
Abeynaya / spaQR_public
Public repository of sparsified QR codes
☆9 Updated 3 years ago
iree-org / iree-nvgpu
☆50 Updated last year
salykova / sgemm.cu
High-Performance SGEMM on CUDA devices
☆98 Updated 6 months ago
NVIDIA / NVTX
The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou…
☆434 Updated this week
Jimver / cuda-toolkit
GitHub Action to install CUDA
☆183 Updated this week