High Quality Resources on GPU Programming/Architecture
☆592 · Jul 26, 2024 · Updated last year
Alternatives and similar repositories for gpu-alpha
Users who are interested in gpu-alpha are comparing it to the repositories listed below.
- From the Tensor to Stable Diffusion, a rough outline for a 10 week course. ☆1,075 · Mar 11, 2026 · Updated last week
- An ML Systems Onboarding list ☆1,008 · Feb 19, 2026 · Updated last month
- This repo is my attempt at a rough implementation of nanoGPT trained on a dataset of 30,000 unique Twitter usernames ☆23 · Apr 7, 2024 · Updated last year
- Simple Transformer in Jax ☆143 · Jun 22, 2024 · Updated last year
- Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish ☆171 · Jul 31, 2024 · Updated last year
- some books and papers and stuff ☆15 · Sep 25, 2024 · Updated last year
- GPU programming related news and material links ☆2,047 · Mar 8, 2026 · Updated last week
- Learnings and programs related to CUDA ☆434 · Jun 29, 2025 · Updated 8 months ago
- From the Transistor to the Web Browser, a rough outline for a 12 week course ☆6,455 · Oct 12, 2021 · Updated 4 years ago
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch ☆275 · Nov 21, 2024 · Updated last year
- UNet diffusion model in pure CUDA ☆657 · Jun 28, 2024 · Updated last year
- Solve puzzles. Improve your pytorch. ☆3,985 · Jul 15, 2024 · Updated last year
- Solve puzzles. Learn CUDA. ☆11,997 · Sep 1, 2024 · Updated last year
- i will automate factorio ☆113 · Jul 31, 2024 · Updated last year
- Just large language models. Hackable, with as little abstraction as possible. Done for my own purposes, feel free to rip. ☆44 · Sep 6, 2023 · Updated 2 years ago
- Personal solutions to the Triton Puzzles ☆20 · Jul 18, 2024 · Updated last year
- learningggggggg 🐳 ☆574 · Apr 2, 2025 · Updated 11 months ago
- a tiny multidimensional array implementation in C similar to numpy, but only one file. ☆226 · Aug 2, 2024 · Updated last year
- Machine Learning Engineering Open Book ☆17,440 · Updated this week
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆11,969 · Aug 18, 2024 · Updated last year
- papers.day ☆93 · Dec 15, 2023 · Updated 2 years ago
- Tutorials on tinygrad ☆462 · Oct 10, 2025 · Updated 5 months ago
- llama3 implementation one matrix multiplication at a time ☆15,252 · May 23, 2024 · Updated last year
- speedrun implementation of dl papers throughout history ☆34 · Mar 19, 2024 · Updated 2 years ago
- LLM101n: Let's build a Storyteller ☆36,489 · Aug 1, 2024 · Updated last year
- A really tiny autograd engine ☆100 · May 26, 2025 · Updated 9 months ago
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆31,592 · Updated this week
- Puzzles for learning Triton ☆2,336 · Updated this week
- (WIP) A small but powerful, homemade PyTorch from scratch. ☆676 · Mar 14, 2026 · Updated last week
- Blazingly fast neighborhood attention ☆14 · Nov 28, 2023 · Updated 2 years ago
- LLM training in simple, raw C/CUDA ☆29,143 · Jun 26, 2025 · Updated 8 months ago
- A deep-dive on the entire history of deep-learning ☆1,549 · Jul 16, 2024 · Updated last year
- parallelized hyperdimensional tictactoe ☆126 · Aug 25, 2024 · Updated last year
- Learning about CUDA by writing PTX code. ☆157 · Feb 27, 2024 · Updated 2 years ago
- Python tools ☆14 · Oct 22, 2023 · Updated 2 years ago
- This is a small autograd engine, made purely from numpy and python. ☆27 · Sep 17, 2024 · Updated last year
- learning & making kernels in cuda / triton ☆22 · Aug 24, 2025 · Updated 6 months ago
- Let's make sand talk ☆588 · Oct 17, 2023 · Updated 2 years ago
- This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa… ☆228 · Jan 2, 2025 · Updated last year