tpn / cuda-samples
☆60 Updated 2 years ago
Alternatives and similar repositories for cuda-samples
Users who are interested in cuda-samples are comparing it to the repositories listed below
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆88 Updated last year
- ☆67 Updated 11 years ago
- An extension library of WMMA API (Tensor Core API) ☆97 Updated 10 months ago
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 Updated last year
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 Updated last year
- Matrix Algebra on GPU and Multicore Architectures (MAGMA) source releases from http://icl.cs.utk.edu/magma/index.html ☆23 Updated 10 years ago
- Examples for using SYCL on CUDA ☆62 Updated 3 months ago
- Intel Data Parallel C++ (and SYCL 2020) Tutorial. ☆93 Updated 3 years ago
- Full-speed Array of Structures access ☆169 Updated 2 years ago
- An implementation of parallel exclusive scan in CUDA ☆62 Updated 7 years ago
- Source code examples from the Parallel Forall Blog ☆96 Updated 6 years ago
- CuPBoP-AMD is a CUDA translator that translates CUDA programs at NVVM IR level to HIP-compatible IR that can run on AMD GPUs. ☆37 Updated last year
- 🎃 GPU load-balancing library for regular and irregular computations. ☆62 Updated 11 months ago
- CUDA official sample codes ☆368 Updated 9 years ago
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆105 Updated 7 years ago
- CUDA kernel author's tools ☆111 Updated 3 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆134 Updated 4 years ago
- Archived implementation of BLAS using the SYCL open standard. See oneMath for a replacement. ☆261 Updated 4 months ago
- Next generation LAPACK implementation for ROCm platform ☆101 Updated this week
- Training material for Nsight developer tools ☆157 Updated 9 months ago
- Examples for HIP ☆207 Updated 6 months ago
- ☆58 Updated 9 months ago
- AMD ROCm Performance Primitives (RPP) library is a comprehensive high-performance computer vision library for AMD processors with HIP/Ope… ☆64 Updated this week
- A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis. ☆52 Updated 2 months ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆154 Updated 2 years ago
- BGHT: High-performance static GPU hash tables. ☆65 Updated last month
- LaTeX Examples Document Source ☆243 Updated 5 months ago
- Subset of BLAS routines optimized for NVIDIA GPUs ☆68 Updated 2 years ago
- SYCL Benchmark Suite ☆64 Updated 3 months ago
- rocWMMA ☆114 Updated this week