adam-maj / tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆8,784 Updated last year
Alternatives and similar repositories for tiny-gpu
Users who are interested in tiny-gpu are comparing it to the libraries listed below:
- LLM training in simple, raw C/CUDA ☆27,852 Updated 3 months ago
- Solve puzzles. Learn CUDA. ☆11,569 Updated last year
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,904 Updated last week
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,093 Updated 10 months ago
- Material for gpu-mode lectures ☆5,170 Updated 3 weeks ago
- Inference Llama 2 in one file of pure C ☆18,848 Updated last year
- The n-gram Language Model ☆1,446 Updated last year
- From the Tensor to Stable Diffusion, a rough outline for a 1 week course. ☆1,069 Updated last week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,021 Updated last year
- Implementation for MatMul-free LM. ☆3,031 Updated 2 months ago
- Tensor library for machine learning ☆13,261 Updated this week
- Video+code lecture on building nanoGPT from scratch ☆4,423 Updated last year
- Minimalist ML framework for Rust ☆18,297 Updated this week
- llama3 implementation one matrix multiplication at a time ☆15,172 Updated last year
- NanoGPT (124M) in 3 minutes ☆3,176 Updated 2 months ago
- Puzzles for learning Triton ☆2,031 Updated 10 months ago
- A nanoGPT pipeline packed in a spreadsheet ☆2,126 Updated last year
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆12,925 Updated last year
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,591 Updated this week
- ☆1,272 Updated last year
- Understanding Deep Learning - Simon J.D. Prince ☆8,344 Updated last month
- Solve puzzles. Improve your pytorch. ☆3,737 Updated last year
- Deep learning at the speed of light. ☆2,550 Updated last week
- GPU programming related news and material links ☆1,729 Updated 3 weeks ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,595 Updated this week
- High Quality Resources on GPU Programming/Architecture ☆589 Updated last year
- If tinygrad wasn't small enough for you... ☆743 Updated last year
- creating a tiny tensor library in raw C ☆820 Updated 7 months ago
- The Autograd Engine ☆636 Updated last year
- A deep-dive on the entire history of deep-learning ☆1,407 Updated last year