A portable high-level API with CUDA or OpenCL back-end
☆56 Oct 8, 2017 Updated 8 years ago
Alternatives and similar repositories for CLCudaAPI
Users who are interested in CLCudaAPI are comparing it to the libraries listed below
- Greentea LibDNN - a universal convolution implementation supporting CUDA and OpenCL ☆137 Apr 20, 2017 Updated 8 years ago
- BLAS OpenCL implementation. ☆16 Apr 8, 2015 Updated 10 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆185 Dec 12, 2022 Updated 3 years ago
- Caffe: a fast open framework for deep learning. ☆14 Aug 26, 2015 Updated 10 years ago
- An optimized OpenCL implementation of Gradient Vector Flow (GVF) that runs on GPUs and CPUs for both 2D and 3D. For more details about th… ☆30 Jan 15, 2017 Updated 9 years ago
- GPU Automatically Tuned Linear Algebra Software ☆28 Sep 1, 2015 Updated 10 years ago
- PyTorch bindings for openai-gemm ☆20 Feb 6, 2017 Updated 9 years ago
- OpenCL multiGPU sample monitoring system health ☆22 Feb 25, 2016 Updated 10 years ago
- WebCL conformance tests ☆20 Feb 9, 2018 Updated 8 years ago
- demonstration for our ACL 2018 paper, "On the Practical Computational Power of Finite Precision RNNs for Language Recognition" ☆11 May 26, 2019 Updated 6 years ago
- Fast binary matrix product on CPU ☆10 Feb 11, 2016 Updated 10 years ago
- a software library containing BLAS functions written in OpenCL ☆865 Aug 2, 2024 Updated last year
- Idiomatic Python bindings for Google Go ☆22 Dec 11, 2019 Updated 6 years ago
- ☆14 Mar 21, 2019 Updated 6 years ago
- Generating Families of Practical Fast Matrix Multiplication Algorithms ☆12 Jul 7, 2017 Updated 8 years ago
- Tuned OpenCL BLAS ☆1,168 Feb 1, 2026 Updated last month
- Libre version of David Hadash ☆14 Feb 6, 2024 Updated 2 years ago
- Programming on the GPU using OpenCL ☆12 Jun 17, 2011 Updated 14 years ago
- VexCL is a C++ vector expression template library for OpenCL/CUDA/OpenMP ☆718 Jul 19, 2025 Updated 7 months ago
- Train Neuronal networks to automate your home ☆19 Mar 1, 2023 Updated 3 years ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆114 May 21, 2024 Updated last year
- assembler for NVIDIA FERMI. Imported from Google Code ☆75 Mar 22, 2015 Updated 10 years ago
- Least Squares Generative Adversarial Network implemented in Chainer ☆18 Dec 11, 2017 Updated 8 years ago
- Direct3D 10 Particle System. All computations via GPU ☆22 Apr 8, 2014 Updated 11 years ago
- Instructions and templates for SC authors ☆17 Aug 22, 2021 Updated 4 years ago
- An universal deep learning models conversor ☆141 Sep 23, 2016 Updated 9 years ago
- Qt Network Authenticators; QtOAuth in particular ☆25 Updated this week
- Reference Hardware Implementations of Bit Extract/Deposit Instructions ☆24 Oct 31, 2017 Updated 8 years ago
- Easy to run kernels using OpenCL ☆187 Apr 22, 2025 Updated 10 months ago
- Code appendix to an OpenCL matrix-multiplication tutorial ☆179 Feb 7, 2017 Updated 9 years ago
- CudaPAD is a PTX/SASS viewer for NVIDIA Cuda kernels and provides an on-the-fly view of the assembly. ☆127 Jan 17, 2023 Updated 3 years ago
- Khronos OpenCL-CLHPP ☆414 Updated this week
- A replacement for libtool written in C ☆38 Jan 10, 2014 Updated 12 years ago
- a software library containing Sparse functions written in OpenCL ☆177 Feb 21, 2020 Updated 6 years ago
- Build NVIDIA® CUDA™ code for OpenCL™ 1.2 devices ☆874 Apr 23, 2025 Updated 10 months ago
- C99/C++ header-only library for division via fixed-point multiplication by inverse ☆60 Apr 14, 2024 Updated last year
- CL Offline Compiler : Compile OpenCL kernels to HSAIL ☆51 May 5, 2017 Updated 8 years ago
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line ☆24 Nov 25, 2025 Updated 3 months ago
- the C++ version of Seq2Seq with ncnn ☆23 Jun 27, 2021 Updated 4 years ago