sol-prog / cuda_cublas_curand_thrust
☆22 · Updated 13 years ago
Alternatives and similar repositories for cuda_cublas_curand_thrust
Users who are interested in cuda_cublas_curand_thrust are comparing it to the libraries listed below.
- Developer repository for ViennaCL. Visit http://viennacl.sourceforge.net/ for the latest releases. ☆288 · Updated 3 years ago
- A CUDNN minimal deep learning training code sample using LeNet. ☆267 · Updated last year
- Code appendix to an OpenCL matrix-multiplication tutorial ☆173 · Updated 8 years ago
- CUSP : A C++ Templated Sparse Matrix Library ☆415 · Updated last month
- Source code that accompanies The CUDA Handbook. ☆529 · Updated 5 months ago
- a software library containing Sparse functions written in OpenCL ☆175 · Updated 5 years ago
- Lecture Slide Issue Tracking ☆253 · Updated 7 years ago
- Source code examples from the Parallel Forall Blog ☆1,298 · Updated last year
- a software library containing BLAS functions written in OpenCL ☆857 · Updated 11 months ago
- ulmBLAS ☆108 · Updated last month
- Multi-GPU Computing Benchmark Suite (CUDA) ☆42 · Updated 8 years ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆180 · Updated 2 years ago
- A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory ☆297 · Updated 6 years ago
- CUDA Tensor Transpose (cuTT) library ☆52 · Updated 7 years ago
- sparse matrix pre-processing library ☆83 · Updated last year
- CUDA and OpenMP implementations of C2R/R2C inplace transposition ☆47 · Updated 10 years ago
- Source code examples from the Parallel Forall Blog ☆96 · Updated 6 years ago
- CUDA C++ package for Sublime Text 2 & 3 ☆68 · Updated 7 years ago
- Optimized half precision gemm assembly kernels (deprecated due to ROCm) ☆47 · Updated 8 years ago
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆885 · Updated this week
- C, C++ and Python Code for Exercises and Solutions ☆521 · Updated 5 years ago
- a heterogeneous multiGPU level-3 BLAS library ☆45 · Updated 5 years ago
- A few cuda examples built with cmake ☆23 · Updated 6 years ago
- High-Performance Tensor Transpose library ☆200 · Updated 2 years ago
- Fork of magma to include more BLAS ☆28 · Updated 8 years ago
- The SHOC Benchmark Suite ☆256 · Updated 3 years ago
- kmeans ☆55 · Updated 9 years ago
- Easy to run kernels using OpenCL ☆185 · Updated 3 months ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 · Updated 6 months ago
- Full-speed Array of Structures access ☆172 · Updated 2 years ago