CodedK / CUDA-by-Example-source-code-for-the-book-s-examples-

CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples.

☆430

Alternatives and similar repositories for CUDA-by-Example-source-code-for-the-book-s-examples-

Users who are interested in CUDA-by-Example-source-code-for-the-book-s-examples- are comparing it to the repositories listed below.

deeperlearning / professional-cuda-c-programming
☆449 Updated 10 years ago
PacktPublishing / Learn-CUDA-Programming
Learn CUDA Programming, published by Packt
☆1,173 Updated last year
olcf / cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
☆820 Updated 11 months ago
RichardAns / CUDA-Programs
Examples from Programming in Parallel with CUDA
☆157 Updated 2 years ago
wangzyon / NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
☆357 Updated 3 years ago
Cjkkkk / CUDA_gemm
A simple high performance CUDA GEMM implementation.
☆389 Updated last year
XiaoSong9905 / CUDA-Optimization-Guide
Xiao's CUDA Optimization Guide [NO LONGER ADDING NEW CONTENT]
☆308 Updated 2 years ago
Liu-xiandong / How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,097 Updated last year
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
☆368 Updated 6 months ago
puttsk / cuda-tutorial
A set of hands-on tutorials for CUDA programming
☆230 Updated last year
R100001 / Programming-Massively-Parallel-Processors
☆173 Updated 11 months ago
zchee / cuda-sample
CUDA official sample codes
☆371 Updated 9 years ago
depctg / udacity-cs344-colab
Google Colab Notebooks for Udacity CS344 - Intro to Parallel Programming
☆135 Updated 4 years ago
leimao / CUDA-GEMM-Optimization
CUDA Matrix Multiplication Optimization
☆205 Updated last year
siboehm / SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
☆776 Updated last year
CUDA-Tutorial / CodeSamples
Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"
☆91 Updated last year
NVIDIA / nsight-training
Training material for Nsight developer tools
☆162 Updated 11 months ago
ArchaeaSoftware / cudahandbook
Source code that accompanies The CUDA Handbook.
☆529 Updated 5 months ago
nvixnu / pmpp__programming_massively_parallel_processors
Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T…
☆72 Updated 4 years ago
eegkno / CUDA_by_practice
CUDA by practice
☆129 Updated 5 years ago
CoffeeBeforeArch / cuda_programming
Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch
☆849 Updated 2 years ago
XiaoSong9905 / HPC-Notes
Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content]
☆68 Updated 2 years ago
flame / blislab
BLISlab: A Sandbox for Optimizing GEMM
☆531 Updated 4 years ago
wzsh / wmma_tensorcore_sample
Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)
☆138 Updated 4 years ago
PacktPublishing / Hands-On-GPU-Programming-with-Python-and-CUDA
Hands-On GPU Programming with Python and CUDA, published by Packt
☆390 Updated 11 months ago
Bruce-Lee-LY / cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…
☆441 Updated 10 months ago
njuhope / cuda_sgemm
☆113 Updated last year
NVIDIA / multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
☆757 Updated 5 months ago
FZJ-JSC / tutorial-multi-gpu
Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial
☆282 Updated last month
tpoisonooo / how-to-optimize-gemm
row-major matmul optimization
☆648 Updated last year