adam-maj / tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆8,049 Updated 7 months ago
Alternatives and similar repositories for tiny-gpu:
Users who are interested in tiny-gpu compare it to the repositories listed below.
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆965 Updated 4 months ago
- Open-source high-performance RISC-V processor ☆6,256 Updated this week
- Implementation for MatMul-free LM. ☆2,977 Updated 5 months ago
- LLM training in simple, raw C/CUDA ☆26,241 Updated 6 months ago
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,850 Updated 3 weeks ago
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,334 Updated last week
- Inference Llama 2 in one file of pure C ☆18,258 Updated 8 months ago
- Solve puzzles. Learn CUDA. ☆10,834 Updated 7 months ago
- CoreNet: A library for training deep neural networks ☆7,001 Updated 5 months ago
- Tensor library for machine learning ☆12,272 Updated this week
- ☆1,465 Updated 3 weeks ago
- If tinygrad wasn't small enough for you... ☆709 Updated last year
- Tile primitives for speedy kernels ☆2,227 Updated this week
- Modern C++ Programming Course (C++03/11/14/17/20/23/26) ☆13,065 Updated last month
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,060 Updated this week
- Material for gpu-mode lectures ☆4,180 Updated 2 months ago
- The book "Performance Analysis and Tuning on Modern CPU" ☆2,940 Updated last month
- A computer science textbook ☆4,030 Updated 7 months ago
- CUDA Templates for Linear Algebra Subroutines ☆7,233 Updated this week
- Berkeley's Spatial Array Generator ☆918 Updated last month
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. ☆5,912 Updated 3 weeks ago
- The official PyTorch implementation of Google's Gemma models ☆5,412 Updated 2 weeks ago
- NanoGPT (124M) in 3 minutes ☆2,465 Updated last week
- On-device AI across mobile, embedded and edge for PyTorch ☆2,694 Updated this week
- A PyTorch native library for large model training ☆3,562 Updated this week
- Blazingly fast LLM inference. ☆5,409 Updated this week
- A retargetable MLIR-based machine learning compiler and runtime toolkit. ☆3,082 Updated this week
- RISC-V CPU simulator for education purposes ☆543 Updated this week
- ChampSim is an open-source trace based simulator maintained at Texas A&M University and through the support of the computer architecture … ☆577 Updated last week
- High-speed Large Language Model Serving for Local Deployment ☆8,167 Updated last month