NVIDIA / CUDALibrarySamples

CUDA Library Samples

☆2,040

Alternatives and similar repositories for CUDALibrarySamples

Users who are interested in CUDALibrarySamples compare it to the repositories listed below.

NVIDIA / cccl
CUDA Core Compute Libraries
☆1,805 Updated last week
olcf / cuda-training-series
Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
☆825 Updated 11 months ago
PacktPublishing / Learn-CUDA-Programming
Learn CUDA Programming, published by Packt
☆1,173 Updated last year
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
☆1,765 Updated last year
NVIDIA / cuda-samples
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
☆7,842 Updated 2 months ago
NVIDIA / multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
☆763 Updated 5 months ago
Liu-xiandong / How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several…
☆1,102 Updated 2 years ago
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
☆691 Updated last week
NVIDIA / cutlass
CUDA Templates for Linear Algebra Subroutines
☆8,149 Updated this week
NVIDIA-developer-blog / code-samples
Source code examples from the Parallel Forall Blog
☆1,300 Updated last year
deeperlearning / professional-cuda-c-programming
☆450 Updated 10 years ago
CoffeeBeforeArch / cuda_programming
Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch
☆852 Updated 2 years ago
siboehm / SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
☆782 Updated last year
CodedK / CUDA-by-Example-source-code-for-the-book-s-examples-
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. …
☆432 Updated 2 years ago
NVIDIA / NVTX
The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou…
☆427 Updated last week
NVIDIA / cudnn-frontend
cudnn_frontend provides a c++ wrapper for the cudnn backend API and samples on how to use it
☆596 Updated 2 weeks ago
brucefan1983 / CUDA-Programming
Sample codes for my CUDA programming book
☆1,765 Updated 5 months ago
NVIDIA / MatX
An efficient C++17 GPU numerical computing library with Python-like syntax
☆1,341 Updated this week
tpoisonooo / how-to-optimize-gemm
row-major matmul optimization
☆649 Updated last year
uxlfoundation / oneMath
oneAPI Math Library (oneMath)
☆701 Updated 2 weeks ago
NVIDIA / cuCollections
☆557 Updated last week
flame / how-to-optimize-gemm
☆1,902 Updated 2 years ago
NVIDIA / nccl
Optimized primitives for collective multi-GPU communication
☆3,889 Updated last week
wangzyon / NVIDIA_SGEMM_PRACTICE
Step-by-step optimization of CUDA SGEMM
☆362 Updated 3 years ago
Bruce-Lee-LY / cuda_hgemm
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruct…
☆445 Updated 10 months ago
NVIDIA / gdrcopy
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
☆1,169 Updated last month
BBuf / how-to-optim-algorithm-in-cuda
how to optimize some algorithm in cuda.
☆2,345 Updated this week
KhronosGroup / OpenCL-Guide
A guide to help developers get up and running quickly with the OpenCL programming framework
☆631 Updated 11 months ago
yzhaiustc / Optimizing-SGEMM-on-NVIDIA-Turing-GPUs
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
☆369 Updated 6 months ago
Tony-Tan / CUDA_Freshman
☆2,503 Updated last year