Erkaman / Awesome-CUDA
This is a list of useful libraries and resources for CUDA development.
☆553 · Updated 7 years ago
Alternatives and similar repositories for Awesome-CUDA:
Users who are interested in Awesome-CUDA compare it to the libraries listed below.
- A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC). ☆531 · Updated last week
- ☆524 · Updated last week
- CUDA Kernel Benchmarking Library ☆593 · Updated last week
- Source code that accompanies The CUDA Handbook. ☆521 · Updated last month
- Patterns and behaviors for GPU computing ☆1,707 · Updated 2 years ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆663 · Updated last month
- Awesome resources for GPUs ☆553 · Updated last year
- Thin, unified, C++-flavored wrappers for the CUDA APIs ☆825 · Updated this week
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,735 · Updated last year
- CUDA Data Parallel Primitives Library ☆428 · Updated 6 years ago
- Demonstration of various hardware effects on CUDA GPUs. ☆365 · Updated last year
- CUDA Core Compute Libraries ☆1,555 · Updated this week
- stdgpu: Efficient STL-like Data Structures on the GPU ☆1,203 · Updated last month
- CUSP: A C++ Templated Sparse Matrix Library ☆411 · Updated 4 months ago
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. … ☆401 · Updated last year
- A curated list of awesome parallel computing resources ☆713 · Updated 2 years ago
- ☆427 · Updated 9 years ago
- Training material for Nsight developer tools ☆151 · Updated 7 months ago
- Kernel Tuner ☆325 · Updated this week
- Source code examples from the Parallel Forall Blog ☆1,270 · Updated 8 months ago
- Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance. ☆329 · Updated 2 months ago
- Library for specialized dense and sparse matrix operations, and deep learning primitives. ☆867 · Updated this week
- Simple utilities to enable code reuse and portability between CUDA C/C++ and standard C/C++. ☆346 · Updated 2 years ago
- CUDA by practice ☆125 · Updated 5 years ago
- An efficient C++17 GPU numerical computing library with Python-like syntax ☆1,300 · Updated this week
- CUDA Matrix Multiplication Optimization ☆173 · Updated 8 months ago
- CLTune: An automatic OpenCL & CUDA kernel tuner ☆177 · Updated 2 years ago
- This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several… ☆960 · Updated last year
- row-major matmul optimization ☆611 · Updated last year
- CUDA Library Samples ☆1,838 · Updated this week