adam-maj / tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆8,937 Updated last year
Alternatives and similar repositories for tiny-gpu
Users who are interested in tiny-gpu are comparing it to the libraries listed below.
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,921 Updated last month
- LLM training in simple, raw C/CUDA ☆28,257 Updated 5 months ago
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,129 Updated last year
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,622 Updated this week
- llama3 implementation one matrix multiplication at a time ☆15,191 Updated last year
- Solve puzzles. Learn CUDA. ☆11,759 Updated last year
- High-speed Large Language Model Serving for Local Deployment ☆8,420 Updated 4 months ago
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,162 Updated 3 months ago
- Implementation for MatMul-free LM. ☆3,038 Updated 4 months ago
- ☆1,794 Updated last week
- Tile primitives for speedy kernels ☆2,955 Updated this week
- ☆1,277 Updated last year
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,183 Updated last year
- Material for gpu-mode lectures ☆5,355 Updated last week
- On-device AI across mobile, embedded and edge for PyTorch ☆3,594 Updated this week
- Open-source high-performance RISC-V processor ☆6,762 Updated this week
- NanoGPT (124M) in 3 minutes ☆3,911 Updated last week
- Learning FPGA, yosys, nextpnr, and RISC-V ☆3,299 Updated 2 weeks ago
- Chisel: A Modern Hardware Design Language ☆4,483 Updated this week
- The official PyTorch implementation of Google's Gemma models ☆5,578 Updated 6 months ago
- Puzzles for learning Triton ☆2,143 Updated last year
- A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1 ☆1,026 Updated 3 months ago
- Inference Llama 2 in one file of pure C ☆18,988 Updated last year
- 3D Visualization of a GPT-style LLM ☆5,134 Updated last year
- Modeling, training, eval, and inference code for OLMo ☆6,197 Updated last week
- ☆1,519 Updated 4 months ago
- Tensor library for machine learning ☆13,648 Updated last week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,714 Updated last week
- GPU programming related news and material links ☆1,803 Updated 2 months ago
- A graphical processor simulator and assembly editor for the RISC-V ISA ☆3,118 Updated 3 weeks ago