CNugteren / CLBlast
Tuned OpenCL BLAS
☆1,099 Updated 2 weeks ago
Alternatives and similar repositories for CLBlast:
Users who are interested in CLBlast compare it to the libraries listed below.
- a software library containing BLAS functions written in OpenCL ☆853 Updated 9 months ago
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆869 Updated this week
- pocl - Portable Computing Language ☆984 Updated this week
- VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP ☆711 Updated 2 weeks ago
- Next generation BLAS implementation for ROCm platform ☆367 Updated this week
- AMD's Machine Intelligence Library ☆1,145 Updated this week
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆178 Updated 2 years ago
- A tool which profiles OpenCL devices to find their peak capacities ☆441 Updated 4 months ago
- Khronos OpenCL-CLHPP ☆397 Updated last month
- a software library containing Sparse functions written in OpenCL ☆174 Updated 5 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 3 months ago
- HIPIFY: Convert CUDA to Portable C++ Code ☆574 Updated this week
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,747 Updated last year
- Khronos OpenCL-Headers ☆705 Updated 3 weeks ago
- Assembler for NVIDIA Maxwell architecture ☆996 Updated 2 years ago
- a software library containing FFT functions written in OpenCL ☆632 Updated 2 years ago
- Developer repository for ViennaCL. Visit http://viennacl.sourceforge.net/ for the latest releases. ☆286 Updated 3 years ago
- Print all known information about all available OpenCL platforms and devices in the system ☆343 Updated 2 months ago
- A C++ GPU Computing Library for OpenCL ☆1,602 Updated this week
- An OpenCL device simulator and debugger ☆355 Updated 8 months ago
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆397 Updated 3 months ago
- oneAPI Math Library (oneMath) ☆671 Updated this week
- Stretching GPU performance for GEMMs and tensor contractions. ☆237 Updated this week
- Intercept Layer for Debugging and Analyzing OpenCL Applications ☆328 Updated last week
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆441 Updated 6 months ago
- Code appendix to an OpenCL matrix-multiplication tutorial ☆167 Updated 8 years ago
- Collection of samples and utilities for using ComputeCpp, Codeplay's SYCL implementation ☆323 Updated last year
- The OpenCL ICD Loader project. ☆262 Updated last week
- HCC is an Open Source, Optimizing C++ Compiler for Heterogeneous Compute currently for the ROCm GPU Computing Platform ☆437 Updated 4 years ago
- Patterns and behaviors for GPU computing ☆1,709 Updated 2 years ago