CUDA-Tutorial / CodeSamples

Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"

☆91

Alternatives and similar repositories for CodeSamples

Users who are interested in CodeSamples are comparing it to the repositories listed below.


owensgroup / BGHT
BGHT: High-performance static GPU hash tables.
☆70 Updated last month
CodedK / CUDA-by-Example-source-code-for-the-book-s-examples-
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. …
☆432 Updated 2 years ago
RichardAns / CUDA-Programs
Examples from Programming in Parallel with CUDA
☆158 Updated 2 years ago
puttsk / cuda-tutorial
A set of hands-on tutorials for CUDA programming
☆230 Updated last year
ndd314 / cuda_examples
☆68 Updated 11 years ago
horizon-research / rtnn
☆67 Updated 2 years ago
GPUPeople / spECK
Efficient SpGEMM on GPU using CUDA and CSR
☆57 Updated 2 years ago
leimao / CUDA-GEMM-Optimization
CUDA Matrix Multiplication Optimization
☆213 Updated last year
tpn / cuda-samples
☆61 Updated 2 years ago
PatWie / cuda-design-patterns
Some CUDA design patterns and a bit of template magic for CUDA
☆156 Updated 2 years ago
wzsh / wmma_tensorcore_sample
Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)
☆138 Updated 4 years ago
MuGdxy / muda
μ-Cuda, COVER THE LAST MILE OF CUDA. With features: intellisense-friendly, structured launch, automatic cuda graph generation and updatin…
☆183 Updated last month
rbaygildin / learn-gpgpu
Algorithms implemented in CUDA + resources about GPGPU
☆56 Updated 3 years ago
NVIDIA / nsight-training
Training material for Nsight developer tools
☆163 Updated 11 months ago
FZJ-JSC / tutorial-multi-gpu
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
☆287 Updated last month
ThoenigAdrian / NeuralNetworksCudaTutorial
Implement Neural Networks in Cuda from Scratch
☆23 Updated last year
ProjectPhysX / PTXprofiler
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis.
☆55 Updated 4 months ago
eegkno / CUDA_by_practice
CUDA by practice
☆129 Updated 5 years ago
gevtushenko / matrix_format_performance
☆29 Updated 5 years ago
ysh329 / OpenMP-101
Learn OpenMP examples step by step
☆95 Updated 6 months ago
deeperlearning / professional-cuda-c-programming
☆452 Updated 10 years ago
robertmaynard / code-samples
Source code examples from the Parallel Forall Blog
☆96 Updated 6 years ago
NVlabs / cub
THIS REPOSITORY HAS MOVED TO github.com/nvidia/cub, WHICH IS AUTOMATICALLY MIRRORED HERE.
☆84 Updated last year
CisMine / Guide-NVIDIA-Tools
NVIDIA tools guide
☆143 Updated 6 months ago
ptheywood / cuda-cmake-github-actions
☆59 Updated 11 months ago
ingowald / cudaKDTree
☆267 Updated last month
owensgroup / SlabHash
A warp-oriented dynamic hash table for GPUs
☆74 Updated last year
wangzyon / NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
☆362 Updated 3 years ago
mark-poscablo / gpu-sum-reduction
CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable.
☆37 Updated 8 years ago
gthparch / CuPBoP-AMD
CuPBoP-AMD is a CUDA translator that translates CUDA programs at NVVM IR level to HIP-compatible IR that can run on AMD GPUs.
☆37 Updated last year