udacity / cs344
Introduction to Parallel Programming class code
☆1,294 Updated 2 years ago
Related projects:
- Source code examples from the Parallel Forall Blog ☆1,223 Updated last month
- Source code that accompanies The CUDA Handbook. ☆493 Updated 2 years ago
- Automatically exported from code.google.com/p/cuda-convnet2 ☆774 Updated 8 years ago
- Patterns and behaviors for GPU computing ☆1,638 Updated 2 years ago
- This is an archive of materials produced for an introductory class on CUDA programming at Stanford University in 2010 ☆190 Updated 2 years ago
- CUDA Data Parallel Primitives Library ☆417 Updated 5 years ago
- [ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl ☆1,669 Updated 11 months ago
- ☆382 Updated 9 years ago
- Assembler for NVIDIA Maxwell architecture ☆940 Updated last year
- This is a list of useful libraries and resources for CUDA development. ☆508 Updated 6 years ago
- My solutions to Udacity's Parallel Programming course (CS 344) ☆76 Updated 7 years ago
- Low-precision matrix multiplication ☆1,772 Updated 7 months ago
- ☆1,725 Updated last year
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. … ☆350 Updated last year
- Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning ☆1,107 Updated 5 years ago
- Easy benchmarking of all publicly accessible implementations of convnets ☆2,673 Updated 7 years ago
- A CUDNN minimal deep learning training code sample using LeNet. ☆257 Updated last year
- Tutorial code on how to build your own Deep Learning System in 2k Lines ☆2,002 Updated 5 years ago
- BLISlab: A Sandbox for Optimizing GEMM ☆466 Updated 3 years ago
- Benchmarking Deep Learning operations on different hardware ☆1,065 Updated 3 years ago
- CUDA official sample codes ☆355 Updated 8 years ago
- Learn CUDA Programming, published by Packt ☆987 Updated 8 months ago
- Acceleration package for neural networks on multi-core CPUs ☆1,671 Updated 3 months ago
- CNN accelerated by CUDA. Tested on MNIST, finally reaching 99.76% ☆181 Updated 6 years ago
- ☆1,655 Updated 6 years ago
- [ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl ☆4,907 Updated 7 months ago
- ATen: A TENsor library for C++11 ☆677 Updated 4 years ago
- Google Colab Notebooks for Udacity CS344 - Intro to Parallel Programming ☆126 Updated 3 years ago
- Open single and half precision gemm implementations ☆364 Updated last year
- Code from the "CUDA Crash Course" YouTube series by CoffeeBeforeArch