adam-maj / tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆9,388 Updated last year
Alternatives and similar repositories for tiny-gpu
Users who are interested in tiny-gpu compare it to the repositories listed below.
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,663 Updated this week
- LLM training in simple, raw C/CUDA ☆28,559 Updated 6 months ago
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,176 Updated last year
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,933 Updated 3 months ago
- CoreNet: A library for training deep neural networks ☆7,022 Updated 3 months ago
- The official PyTorch implementation of Google's Gemma models ☆5,595 Updated 7 months ago
- A PyTorch native platform for training generative AI models ☆4,947 Updated this week
- llama3 implementation one matrix multiplication at a time ☆15,231 Updated last year
- Solve puzzles. Improve your pytorch. ☆3,872 Updated last year
- Open-source high-performance RISC-V processor ☆6,834 Updated this week
- Learning FPGA, yosys, nextpnr, and RISC-V ☆3,360 Updated last month
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆10,254 Updated last year
- Implementation for MatMul-free LM. ☆3,045 Updated last month
- Tile primitives for speedy kernels ☆3,038 Updated this week
- ☆1,852 Updated last week
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆6,175 Updated 4 months ago
- Solve puzzles. Learn CUDA. ☆11,874 Updated last year
- Material for gpu-mode lectures ☆5,523 Updated last month
- Machine Learning Engineering Open Book ☆16,184 Updated 3 weeks ago
- Inference Llama 2 in one file of pure C ☆19,089 Updated last year
- Puzzles for learning Triton ☆2,222 Updated last year
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆9,076 Updated this week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,880 Updated this week
- GPU programming related news and material links ☆1,886 Updated 3 months ago
- NanoGPT (124M) in 3 minutes ☆4,116 Updated this week
- Development repository for the Triton language and compiler ☆18,098 Updated this week
- ☆1,281 Updated last year
- A nanoGPT pipeline packed in a spreadsheet ☆2,141 Updated last year
- A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1 ☆1,119 Updated 4 months ago
- A Python framework for accelerated simulation, data generation and spatial computing. ☆6,009 Updated this week