adam-maj / tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆8,841 Updated last year
Alternatives and similar repositories for tiny-gpu
Users who are interested in tiny-gpu are comparing it to the repositories listed below.
- Open-source high-performance RISC-V processor ☆6,712 Updated this week
- Solve puzzles. Learn CUDA. ☆11,616 Updated last year
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,915 Updated 3 weeks ago
- Inference Llama 2 in one file of pure C ☆18,912 Updated last year
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,109 Updated 11 months ago
- Material for gpu-mode lectures ☆5,222 Updated last month
- LLM training in simple, raw C/CUDA ☆28,081 Updated 4 months ago
- llama3 implementation one matrix multiplication at a time ☆15,182 Updated last year
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,096 Updated last year
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,602 Updated last week
- 3D Visualization of a GPT-style LLM ☆5,101 Updated last year
- Implementation for MatMul-free LM. ☆3,034 Updated 3 months ago
- A PyTorch native platform for training generative AI models ☆4,645 Updated this week
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,140 Updated 2 months ago
- Solve puzzles. Improve your pytorch. ☆3,761 Updated last year
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,650 Updated this week
- Tensor library for machine learning ☆13,361 Updated this week
- A graphical processor simulator and assembly editor for the RISC-V ISA ☆3,102 Updated this week
- ☆1,732 Updated this week
- Puzzles for learning Triton ☆2,090 Updated 11 months ago
- NanoGPT (124M) in 3 minutes ☆3,755 Updated last week
- Machine Learning Engineering Open Book ☆15,578 Updated last week
- Learning FPGA, yosys, nextpnr, and RISC-V ☆3,162 Updated 8 months ago
- Tile primitives for speedy kernels ☆2,859 Updated last week
- RISC-V XV6/Linux SoC, marchID: 0x2b ☆983 Updated this week
- The book "Performance Analysis and Tuning on Modern CPU" ☆3,351 Updated 4 months ago
- A deep-dive on the entire history of deep-learning ☆1,413 Updated last year
- ☆1,276 Updated last year
- GPU programming related news and material links ☆1,749 Updated last month
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆8,705 Updated last week