loganwatchorn / notes-pmpp

Notes on "Programming Massively Parallel Processors" by Hwu, Kirk, and Hajj (4th ed.)

☆53

Alternatives and similar repositories for notes-pmpp

Users who are interested in notes-pmpp are comparing it to the repositories listed below.

Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆414 Updated last month
smolorg / smolgrad
small auto-grad engine inspired from Karpathy's micrograd and PyTorch
☆274 Updated 8 months ago
unixpickle / learn-ptx
Learning about CUDA by writing PTX code.
☆133 Updated last year
JINO-ROHIT / advanced_ml
☆59 Updated last week
apoorvnandan / lilgrad
pytorch from scratch in pure C/CUDA and python
☆40 Updated 9 months ago
tgautam03 / xGeMM
Accelerated General (FP32) Matrix Multiplication from scratch in CUDA
☆123 Updated 6 months ago
ulrichstern / cuda-convnet
Alex Krizhevsky's original code from Google Code
☆195 Updated 9 years ago
SwekeR-463 / kernels
learning & making kernels in cuda / triton
☆22 Updated last month
drkennetz / cuda_examples
Some CUDA example code with READMEs.
☆169 Updated 5 months ago
1y33 / 100Days
GPU Kernels
☆191 Updated 3 months ago
wentasah / mmul-anim
Visualization of cache-optimized matrix multiplication
☆153 Updated 4 months ago
linjames0 / Transformer-CUDA
An implementation of the transformer architecture onto an Nvidia CUDA kernel
☆189 Updated last year
0xD4rky / Vision-Transformers
This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa…
☆228 Updated 7 months ago
Laz4rz / GPT-2
Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish
☆172 Updated last year
hkproj / 100-days-of-gpu
☆358 Updated 3 months ago
joey00072 / Tinytorch
A really tiny autograd engine
☆95 Updated 2 months ago
EurekaLabsAI / tensor
The Tensor (or Array)
☆441 Updated 11 months ago
cloneofsimo / ptx-tutorial-by-aislop
PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7)
☆66 Updated 4 months ago
arpitingle / gpu-alpha
High Quality Resources on GPU Programming/Architecture
☆588 Updated last year
rkinas / triton-resources
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
☆383 Updated 4 months ago
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆64 Updated 2 months ago
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆188 Updated 2 months ago
CisMine / GPU-in-ML-DL
Apply GPU in ML and DL
☆52 Updated 5 months ago
tugot17 / pmpp
Complete solutions to the Programming Massively Parallel Processors Edition 4
☆450 Updated last month
CisMine / Parallel-Computing-Cuda-C
CUDA Learning guide
☆419 Updated last year
rkinas / cuda-learning
This repository is a curated collection of resources, tutorials, and practical examples designed to guide you through the journey of mast…
☆363 Updated 5 months ago
omkaark / simple-federated-learning
☆96 Updated last year
MarioSieg / magnetron
(WIP) A small but powerful, homemade PyTorch from scratch.
☆558 Updated this week
mlops-discord / gpu-optimization-workshop
Slides, notes, and materials for the workshop
☆328 Updated last year
Laz4rz / leetcode
Intro to leetcodes. Basic techniques, quicksort and hash structures implementation, space and time complexities.
☆96 Updated last year