NVIDIA / accelerated-computing-hub
NVIDIA-curated collection of educational resources related to general-purpose GPU programming.
☆401 · Updated last week
Alternatives and similar repositories for accelerated-computing-hub:
Users who are interested in accelerated-computing-hub compare it to the repositories listed below:
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆165 · Updated last month
- NVIDIA Math Libraries for the Python Ecosystem ☆286 · Updated last month
- CUDA Matrix Multiplication Optimization ☆181 · Updated 9 months ago
- Examples demonstrating available options to program multiple GPUs in a single node or a cluster ☆689 · Updated 2 months ago
- Step-by-step optimization of CUDA SGEMM ☆310 · Updated 3 years ago
- Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial ☆255 · Updated last month
- A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser") ☆318 · Updated this week
- Fastest kernels written from scratch ☆236 · Updated 3 weeks ago
- NVIDIA tools guide ☆129 · Updated 3 months ago
- KvikIO - High Performance File IO ☆206 · Updated this week
- Some CUDA example code with READMEs. ☆94 · Updated last month
- GPU programming related news and material links ☆1,461 · Updated 3 months ago
- ☆152 · Updated 8 months ago
- A plugin for Jupyter Notebook to run CUDA C/C++ code ☆225 · Updated 7 months ago
- Slides, notes, and materials for the workshop ☆324 · Updated 10 months ago
- Kernel Tuner ☆328 · Updated last week
- CUDA Kernel Benchmarking Library ☆621 · Updated this week
- ☆153 · Updated last year
- Evaluating Large Language Models for CUDA Code Generation. ComputeEval is a framework designed to generate and evaluate CUDA code from Lar… ☆29 · Updated 3 weeks ago
- Experimental projects related to TensorRT ☆97 · Updated this week
- Cataloging released Triton kernels. ☆217 · Updated 3 months ago
- The CUDA target for Numba ☆106 · Updated this week
- Training material for Nsight developer tools ☆156 · Updated 8 months ago
- ☆537 · Updated this week
- The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resou… ☆376 · Updated 2 weeks ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆79 · Updated last year
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆179 · Updated last year
- Fast CUDA matrix multiplication from scratch ☆691 · Updated last year
- The Foundation for All Legate Libraries ☆213 · Updated this week
- Examples from Programming in Parallel with CUDA ☆134 · Updated 2 years ago