CUDA Learning guide
☆531 · Jun 20, 2024 · Updated last year
Alternatives and similar repositories for Parallel-Computing-Cuda-C
Users who are interested in Parallel-Computing-Cuda-C are comparing it to the libraries listed below
- NVIDIA tools guide ☆162 · Jan 7, 2025 · Updated last year
- Read custom dataset ☆12 · Mar 31, 2023 · Updated 2 years ago
- GPU programming related news and material links ☆1,997 · Sep 17, 2025 · Updated 5 months ago
- Examples from Programming in Parallel with CUDA ☆170 · Feb 5, 2026 · Updated 3 weeks ago
- Setup Cuda ☆26 · May 23, 2024 · Updated last year
- Learn CUDA with PyTorch ☆231 · Feb 23, 2026 · Updated last week
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression. ☆13 · Sep 29, 2024 · Updated last year
- My study notes and hands-on projects for CUDA-based GPU programming ☆10 · Dec 11, 2025 · Updated 2 months ago
- Learn CUDA Programming, published by Packt ☆1,231 · Dec 30, 2023 · Updated 2 years ago
- Solve puzzles. Learn CUDA. ☆11,959 · Sep 1, 2024 · Updated last year
- Standalone commandline CLI tool for compiling Triton kernels ☆20 · Sep 13, 2024 · Updated last year
- ☆3,316 · Feb 7, 2026 · Updated 3 weeks ago
- GPU Kernels ☆221 · Apr 27, 2025 · Updated 10 months ago
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆869 · Mar 29, 2025 · Updated 11 months ago
- Implement Neural Networks in Cuda from Scratch ☆24 · May 17, 2024 · Updated last year
- Fast low-bit matmul kernels in Triton ☆433 · Feb 1, 2026 · Updated last month
- UNet diffusion model in pure CUDA ☆657 · Jun 28, 2024 · Updated last year
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆95 · Aug 14, 2023 · Updated 2 years ago
- Skeleton code and visualization for basic molecular dynamics simulator ☆27 · Dec 12, 2021 · Updated 4 years ago
- Material for gpu-mode lectures ☆5,773 · Feb 1, 2026 · Updated last month
- ☆90 · Nov 11, 2025 · Updated 3 months ago
- Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/) ☆942 · Aug 19, 2024 · Updated last year
- Step by step implementation of a fast softmax kernel in CUDA ☆61 · Jan 6, 2025 · Updated last year
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆253 · May 6, 2025 · Updated 9 months ago
- High Quality Resources on GPU Programming/Architecture ☆593 · Jul 26, 2024 · Updated last year
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆9,315 · Updated this week
- Implementation from scratch in CUDA C++ of image processing algorithms. ☆21 · Oct 26, 2020 · Updated 5 years ago
- Fast CUDA matrix multiplication from scratch ☆1,060 · Sep 2, 2025 · Updated 5 months ago
- CUDA Library Samples ☆2,324 · Feb 21, 2026 · Updated last week
- Samples for CUDA Developers which demonstrates features in CUDA Toolkit ☆8,870 · Jan 6, 2026 · Updated last month
- CUDA Core Compute Libraries ☆2,182 · Updated this week
- ☆454 · Dec 18, 2025 · Updated 2 months ago
- Step-by-step optimization of CUDA SGEMM ☆432 · Mar 30, 2022 · Updated 3 years ago
- Learnings and programs related to CUDA ☆433 · Jun 29, 2025 · Updated 8 months ago
- ☆25 · Nov 12, 2025 · Updated 3 months ago
- ☆91 · Feb 29, 2024 · Updated 2 years ago
- Mixed precision training from scratch with Tensors and CUDA ☆28 · May 14, 2024 · Updated last year
- Tool to train/test models on 3d point cloud segmentation ☆10 · Jun 14, 2025 · Updated 8 months ago
- ☆10 · Nov 16, 2024 · Updated last year