A minimal GPU design in Verilog to learn how GPUs work from the ground up
☆12,136 · Aug 18, 2024 · Updated last year
Alternatives and similar repositories for tiny-gpu
Users who are interested in tiny-gpu are comparing it to the repositories listed below.
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,301 · Nov 22, 2024 · Updated last year
- LLM training in simple, raw C/CUDA ☆29,359 · Jun 26, 2025 · Updated 9 months ago
- An open source GPU based off of the AMD Southern Islands ISA. ☆1,359 · Aug 18, 2025 · Updated 7 months ago
- ☆1,957 · Apr 3, 2026 · Updated last week
- Open-source high-performance RISC-V processor ☆6,958 · Updated this week
- GPGPU processor supporting RISCV-V extension, developed with Chisel HDL ☆880 · Updated this week
- Development repository for the Triton language and compiler ☆18,840 · Updated this week
- Solve puzzles. Learn CUDA. ☆12,027 · Sep 1, 2024 · Updated last year
- Tile primitives for speedy kernels ☆3,304 · Mar 28, 2026 · Updated last week
- You like pytorch? You like micrograd? You love tinygrad! ❤️ ☆32,153 · Updated this week
- GPGPU microprocessor architecture ☆2,185 · Nov 8, 2024 · Updated last year
- Inference Llama 2 in one file of pure C ☆19,351 · Aug 6, 2024 · Updated last year
- Material for gpu-mode lectures ☆5,923 · Feb 1, 2026 · Updated 2 months ago
- CUDA Templates and Python DSLs for High-Performance Linear Algebra ☆9,536 · Apr 2, 2026 · Updated last week
- llama3 implementation one matrix multiplication at a time ☆15,244 · May 23, 2024 · Updated last year
- LLM inference in C/C++ ☆101,475 · Updated this week
- Learning FPGA, yosys, nextpnr, and RISC-V ☆3,462 · Nov 18, 2025 · Updated 4 months ago
- The CORE-V CVA6 is a highly configurable, 6-stage RISC-V core for both application and embedded applications. Application class configura… ☆2,873 · Mar 23, 2026 · Updated 2 weeks ago
- Chisel: A Modern Hardware Design Language ☆4,626 · Apr 3, 2026 · Updated last week
- Verilator open-source SystemVerilog simulator and lint system ☆3,517 · Updated this week
- A lightweight library for portable low-level GPU computation using WebGPU. ☆3,960 · Oct 8, 2025 · Updated 6 months ago
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,790 · Apr 2, 2026 · Updated last week
- PicoRV32 - A Size-Optimized RISC-V CPU ☆4,077 · Jun 27, 2024 · Updated last year
- CoreNet: A library for training deep neural networks ☆7,004 · Oct 9, 2025 · Updated 6 months ago
- opensouce RISC-V cpu core implemented in Verilog from scratch in one night! ☆2,527 · Jan 7, 2026 · Updated 3 months ago
- A massively parallel, high-level programming language ☆19,198 · Jun 3, 2025 · Updated 10 months ago
- An Agile RISC-V SoC Design Framework with in-order cores, out-of-order cores, accelerators, and more ☆2,209 · Apr 1, 2026 · Updated last week
- Machine Learning Engineering Open Book ☆17,642 · Mar 16, 2026 · Updated 3 weeks ago
- RISC-V XV6/Linux SoC, marchID: 0x2b ☆1,077 · Mar 3, 2026 · Updated last month
- A PyTorch native platform for training generative AI models ☆5,205 · Apr 3, 2026 · Updated last week
- Berkeley's Spatial Array Generator ☆1,270 · Mar 29, 2026 · Updated last week
- Tensor library for machine learning ☆14,340 · Apr 2, 2026 · Updated last week
- A FPGA friendly 32 bit RISC-V CPU implementation ☆3,095 · Feb 11, 2026 · Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆75,637 · Updated this week
- 📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉 ☆10,217 · Updated this week
- The Ultra-Low Power RISC-V Core ☆1,796 · Aug 6, 2025 · Updated 8 months ago
- Build your hardware, easily! ☆3,817 · Apr 2, 2026 · Updated last week
- 32-bit Superscalar RISC-V CPU ☆1,223 · Sep 18, 2021 · Updated 4 years ago
- How to make undergraduates or new graduates ready for advanced computer architecture research or modern CPU design ☆632 · Aug 13, 2024 · Updated last year