High Quality Resources on GPU Programming/Architecture
☆593 · Jul 26, 2024 · Updated last year
Alternatives and similar repositories for gpu-alpha
Users who are interested in gpu-alpha are comparing it to the repositories listed below.
- From the Tensor to Stable Diffusion, a rough outline for a 1 week course. ☆1,071 · Oct 5, 2025 · Updated 4 months ago
- An ML Systems Onboarding list ☆994 · Feb 19, 2026 · Updated last week
- This repo is my attempt at a rough implementation of nanoGPT trained on a dataset of 30,000 unique Twitter usernames ☆23 · Apr 7, 2024 · Updated last year
- Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆172 · Jul 31, 2024 · Updated last year
- GPU programming related news and material links ☆1,997 · Sep 17, 2025 · Updated 5 months ago
- Learnings and programs related to CUDA ☆433 · Jun 29, 2025 · Updated 8 months ago
- From the Transistor to the Web Browser, a rough outline for a 12 week course ☆6,435 · Oct 12, 2021 · Updated 4 years ago
- some books and papers and stuff ☆15 · Sep 25, 2024 · Updated last year
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆275 · Nov 21, 2024 · Updated last year
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 · Sep 6, 2023 · Updated 2 years ago
- UNet diffusion model in pure CUDA ☆657 · Jun 28, 2024 · Updated last year
- Solve puzzles. Improve your pytorch. ☆3,950 · Jul 15, 2024 · Updated last year
- i will automate factorio ☆111 · Jul 31, 2024 · Updated last year
- Solve puzzles. Learn CUDA. ☆11,959 · Sep 1, 2024 · Updated last year
- a tiny multidimensional array implementation in C similar to numpy, but only one file. ☆225 · Aug 2, 2024 · Updated last year
- Machine Learning Engineering Open Book ☆17,162 · Feb 21, 2026 · Updated last week
- learningggggggg 🐳 ☆575 · Apr 2, 2025 · Updated 10 months ago
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆11,766 · Aug 18, 2024 · Updated last year
- papers.day ☆93 · Dec 15, 2023 · Updated 2 years ago
- Cerule - A Tiny Mighty Vision Model ☆68 · Nov 9, 2025 · Updated 3 months ago
- This is a small autograd engine, made purely from numpy and python. ☆27 · Sep 17, 2024 · Updated last year
- llama3 implementation one matrix multiplication at a time ☆15,243 · May 23, 2024 · Updated last year
- A miniature version of Modal ☆23 · Jun 11, 2024 · Updated last year
- LLM101n: Let's build a Storyteller ☆36,357 · Aug 1, 2024 · Updated last year
- Puzzles for learning Triton ☆2,314 · Nov 18, 2024 · Updated last year
- NanoGPT-speedrunning for the poor T4 enjoyers ☆73 · Apr 22, 2025 · Updated 10 months ago
- A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code. ☆459 · Mar 10, 2025 · Updated 11 months ago
- Tutorials on tinygrad ☆456 · Oct 10, 2025 · Updated 4 months ago
- This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa… ☆228 · Jan 2, 2025 · Updated last year
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆31,424 · Updated this week
- (WIP) A small but powerful, homemade PyTorch from scratch. ☆676 · Updated this week
- LLM training in simple, raw C/CUDA ☆28,940 · Jun 26, 2025 · Updated 8 months ago
- A deep-dive on the entire history of deep-learning ☆1,531 · Jul 16, 2024 · Updated last year
- Cookbook for Crafting Good Code ☆57 · Mar 19, 2024 · Updated last year
- Simple orchestration for EC2 spot containers ☆19 · Sep 27, 2024 · Updated last year
- A 120-day CUDA learning plan covering daily concepts, exercises, pitfalls, and references (including “Programming Massively Parallel Proc… ☆869 · Mar 29, 2025 · Updated 11 months ago
- No frills LLM-assisted programming ☆252 · Jul 24, 2024 · Updated last year
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 · May 8, 2024 · Updated last year
- learning & making kernels in cuda / triton ☆22 · Aug 24, 2025 · Updated 6 months ago