A minimal tensor processing unit (TPU), inspired by Google's TPU V2 and V1
☆1,181 · Mar 1, 2026 · Updated this week
Alternatives and similar repositories for tiny-tpu
Users who are interested in tiny-tpu are comparing it to the repositories listed below
- An open source reimplementation of Google's Tensor Processing Unit (TPU). ☆735 · Dec 6, 2017 · Updated 8 years ago
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆11,766 · Aug 18, 2024 · Updated last year
- A minimal Tensor Processing Unit (TPU) inspired by Google's TPUv1. ☆198 · Aug 10, 2024 · Updated last year
- ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference ☆187 · Jan 8, 2026 · Updated last month
- a mini 2x2 systolic array and PE demo ☆68 · Dec 21, 2025 · Updated 2 months ago
- PyTorch script hot swap: Change code without unloading your LLM from VRAM ☆125 · Apr 21, 2025 · Updated 10 months ago
- OpenSource GPU, in Verilog, loosely based on RISC-V ISA ☆1,273 · Nov 22, 2024 · Updated last year
- Tensor library & inference framework for machine learning ☆116 · Oct 3, 2025 · Updated 5 months ago
- ☆1,908 · Updated this week
- MR1 formally verified RISC-V CPU ☆57 · Dec 16, 2018 · Updated 7 years ago
- GPU programming related news and material links ☆2,010 · Sep 17, 2025 · Updated 5 months ago
- ☆63 · Apr 22, 2025 · Updated 10 months ago
- Allo Accelerator Design and Programming Framework (PLDI'24) ☆352 · Feb 8, 2026 · Updated 3 weeks ago
- Implementation of a Systolic Array based sorting engine on an FPGA using Verilog ☆11 · May 11, 2017 · Updated 8 years ago
- Tile primitives for speedy kernels ☆3,202 · Feb 24, 2026 · Updated last week
- NPUsim: Full-Model, Cycle-Level, and Value-Aware Simulator for DNN Accelerators ☆50 · Jan 2, 2025 · Updated last year
- Advanced Architecture Labs with CVA6 ☆78 · Jan 16, 2024 · Updated 2 years ago
- My 6502 project in a PC104 like formfactor ☆37 · Dec 30, 2025 · Updated 2 months ago
- ☆12 · Sep 18, 2024 · Updated last year
- ☆14 · Dec 27, 2024 · Updated last year
- Tensor Processing Unit implementation in Verilog ☆13 · Mar 18, 2025 · Updated 11 months ago
- For hosting ATS3 and developing CodeDepot ☆18 · Feb 6, 2026 · Updated 3 weeks ago
- Matrix Accelerator Generator for GeMM Operations based on SIGMA Architecture in CHISEL HDL ☆15 · Mar 21, 2024 · Updated last year
- Berkeley's Spatial Array Generator ☆1,225 · Updated this week
- A C11 compiler for the discrete logic computer ☆21 · Apr 3, 2024 · Updated last year
- An FPGA friendly 32 bit RISC-V CPU implementation ☆3,032 · Feb 11, 2026 · Updated 3 weeks ago
- FSA: Fusing FlashAttention within a Single Systolic Array ☆89 · Aug 12, 2025 · Updated 6 months ago
- Open-source high-performance RISC-V processor ☆6,885 · Updated this week
- Veryl: A Modern Hardware Description Language ☆893 · Updated this week
- RSD: RISC-V Out-of-Order Superscalar Processor ☆1,152 · Feb 21, 2026 · Updated last week
- ☆1,075 · May 18, 2025 · Updated 9 months ago
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆4,023 · Updated this week
- Modular hardware build system ☆1,130 · Feb 26, 2026 · Updated last week
- ☆161 · Jan 4, 2026 · Updated 2 months ago
- Verilog module implementing the TPU's systolic array for computing convolutions ☆159 · May 10, 2025 · Updated 9 months ago
- CORE-V Wally is a configurable RISC-V Processor associated with RISC-V System-on-Chip Design textbook. Contains a 5-stage pipeline, suppo… ☆488 · Feb 25, 2026 · Updated last week
- An implementation of Colin James' "Compiling Lambda Calculus" ☆16 · Sep 29, 2022 · Updated 3 years ago
- A simple CPU in VHDL for educational purposes ☆40 · Updated this week
- Arithmetic multiplier benchmarks ☆12 · Nov 13, 2017 · Updated 8 years ago