CNugteren / CLBlast
Tuned OpenCL BLAS
☆1,121 Updated 2 weeks ago
Alternatives and similar repositories for CLBlast
Users who are interested in CLBlast are comparing it to the libraries listed below.
- a software library containing BLAS functions written in OpenCL ☆857 Updated 11 months ago
- pocl - Portable Computing Language ☆1,005 Updated this week
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆881 Updated last week
- A tool which profiles OpenCL devices to find their peak capacities ☆457 Updated 3 weeks ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆382 Updated last week
- Assembler for NVIDIA Maxwell architecture ☆1,012 Updated 2 years ago
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,757 Updated last year
- Patterns and behaviors for GPU computing ☆1,726 Updated 3 years ago
- An OpenCL device simulator and debugger ☆357 Updated 10 months ago
- Developer repository for ViennaCL. Visit http://viennacl.sourceforge.net/ for the latest releases. ☆288 Updated 3 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 5 months ago
- Khronos OpenCL-CLHPP ☆401 Updated last month
- VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP ☆714 Updated 2 months ago
- Code appendix to an OpenCL matrix-multiplication tutorial ☆173 Updated 8 years ago
- Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices ☆860 Updated 2 months ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆180 Updated 2 years ago
- AMD's Machine Intelligence Library ☆1,163 Updated last week
- Print all known information about all available OpenCL platforms and devices in the system ☆351 Updated 2 weeks ago
- Khronos OpenCL-Headers ☆719 Updated this week
- The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs ☆1,306 Updated 2 months ago
- a software library containing FFT functions written in OpenCL ☆637 Updated 2 years ago
- Collection of samples and utilities for using ComputeCpp, Codeplay's SYCL implementation ☆325 Updated last year
- Intercept Layer for Debugging and Analyzing OpenCL Applications ☆332 Updated this week
- The OpenCL ICD Loader project. ☆268 Updated 3 weeks ago
- A C++ GPU Computing Library for OpenCL ☆1,616 Updated 2 months ago
- Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, … ☆1,660 Updated this week
- CUDA Data Parallel Primitives Library ☆432 Updated 6 years ago
- A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP) ☆404 Updated 5 months ago
- Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group ☆443 Updated 8 months ago
- GPUOCelot: A dynamic compilation framework for PTX ☆288 Updated last year