bkvogel / metal_performance_testing
Scientific computing with Metal in C++: Matrix multiplication example
☆22 Updated 2 years ago
Related projects
Alternatives and complementary repositories for metal_performance_testing
- Metal Shading Language on Apple M1's GPU for scientific C++. ☆82 Updated last year
- Study and Implementations of Numerical Algorithms on Apple M1 and A* Devices ☆124 Updated last year
- ROCm Thrust - run Thrust dependent software on AMD GPUs ☆100 Updated this week
- C++ Template Linear Algebra PACKage ☆41 Updated this week
- Reference Implementation for stdBLAS ☆128 Updated 3 weeks ago
- Emulating double-precision arithmetic on Apple GPUs ☆47 Updated last year
- An implementation of BLAS using the SYCL open standard. ☆259 Updated 2 weeks ago
- ROCm Parallel Primitives ☆162 Updated this week
- Next generation LAPACK implementation for ROCm platform ☆94 Updated this week
- Source code for 'Data Parallel C++: Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL' by James Reinders, Ben A… ☆252 Updated last month
- C++ HPC Tutorial materials ☆48 Updated 4 months ago
- GTensor is a multi-dimensional array C++14 header-only library for hybrid GPU development. ☆35 Updated 2 months ago
- Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda ☆80 Updated this week
- CS infrastructure components for HPC applications ☆157 Updated this week
- AMD’s C++ library for accelerating tensor primitives ☆35 Updated this week
- Simple OpenCL Samples that Build with Khronos Headers and Libs ☆88 Updated last week
- Reusable software components for ROCm developers ☆79 Updated this week
- Running linear algebra as fast as possible on Apple silicon ☆18 Updated last year
- Subset of BLAS routines optimized for NVIDIA GPUs ☆65 Updated last year
- OpenMPL (Open Math Performance Library) is an open source collection of math libraries, including BLAS, LAPACK, FFT, VML, and others. ☆17 Updated last year
- Atomistic Spin Simulation Framework ☆65 Updated 4 years ago
- ROCm SPARSE marshalling library ☆69 Updated this week
- Counter-based random number generators for C, C++ and CUDA. ☆89 Updated 9 months ago
- Examples for HIP ☆200 Updated 2 weeks ago
- SYCL Conformance Tests ☆62 Updated this week
- SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) sy… ☆93 Updated 3 weeks ago
- Next generation library for iterative sparse solvers for ROCm platform ☆76 Updated this week
- Software library for FDTD of viscoelastic equation using a staggered grid arrangement with support for GPU and CPU backends ☆54 Updated 4 months ago
- Copy-hiding array abstraction to automatically migrate data between memory spaces ☆106 Updated this week
- Header-only C++20 wrapper for MPI 4.0. ☆43 Updated last year