zchee / cuda-sample
CUDA official sample codes
☆367 · Updated 9 years ago
Alternatives and similar repositories for cuda-sample:
Users who are interested in cuda-sample are comparing it to the repositories listed below.
- CUDA by practice ☆126 · Updated 5 years ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆697 · Updated 2 months ago
- ☆436 · Updated 9 years ago
- Source code examples from the Parallel Forall Blog ☆1,284 · Updated 9 months ago
- Training material for Nsight developer tools ☆157 · Updated 9 months ago
- Thin, unified, C++-flavored wrappers for the CUDA APIs ☆837 · Updated last week
- Instructions, Docker images, and examples for Nsight Compute and Nsight Systems ☆131 · Updated 4 years ago
- ☆537 · Updated this week
- Source code that accompanies The CUDA Handbook. ☆522 · Updated 3 months ago
- CUDA Kernel Benchmarking Library ☆629 · Updated this week
- ☆67 · Updated 11 years ago
- CUDA Matrix Multiplication Optimization ☆184 · Updated 9 months ago
- Kernel Tuner ☆331 · Updated last week
- ☆59 · Updated 2 years ago
- cuDNN sample codes provided by Nvidia ☆45 · Updated 6 years ago
- Collection of benchmarks to measure basic GPU capabilities ☆369 · Updated 2 months ago
- This is a list of useful libraries and resources for CUDA development. ☆563 · Updated 7 years ago
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆533 · Updated last month
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆88 · Updated last year
- A GPU benchmark suite for assessing on-chip GPU memory bandwidth ☆104 · Updated 7 years ago
- Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core) ☆131 · Updated 4 years ago
- THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE. ☆84 · Updated last year
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. … ☆413 · Updated last year
- Example of how to use CUDA with CMake >= 3.8 ☆69 · Updated last year
- portDNN is a library implementing neural network algorithms written using SYCL ☆113 · Updated 11 months ago
- CUDA Data Parallel Primitives Library ☆430 · Updated 6 years ago
- Step-by-step optimization of CUDA SGEMM ☆315 · Updated 3 years ago
- Full-speed Array of Structures access ☆169 · Updated 2 years ago
- Some CUDA design patterns and a bit of template magic for CUDA ☆150 · Updated last year
- Assembler for NVIDIA Volta and Turing GPUs ☆218 · Updated 3 years ago