KohakuBlueleaf / HakuTPU
An AI accelerator implementation with Xilinx FPGA
☆64 Updated 9 months ago
Alternatives and similar repositories for HakuTPU
Users who are interested in HakuTPU are comparing it to the repositories listed below
 - FREE TPU V3plus for FPGA is the free version of a commercial AI processor (EEP-TPU) for Deep Learning EDGE Inference ☆159 Updated 2 years ago
 - Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆169 Updated last year
 - Research and Materials on Hardware implementation of Transformer Model ☆285 Updated 8 months ago
 - Small-scale Tensor Processing Unit built on an FPGA ☆206 Updated 6 years ago
 - Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions ☆201 Updated 5 years ago
 - ☆108 Updated last year
 - A Framework for Hardware-Aware LLM Exploration ☆34 Updated this week
 - Machine-Learning Accelerator System Exploration Tools ☆179 Updated 3 weeks ago
 - FPGA based Vision Transformer accelerator (Harvard CS205) ☆134 Updated 8 months ago
 - ☆220 Updated last year
 - IC implementation of Systolic Array for TPU ☆290 Updated last year
 - IC implementation of TPU ☆135 Updated 5 years ago
 - A high-efficiency system-on-chip for floating-point compute workloads. ☆43 Updated 9 months ago
 - This is a Verilog implementation of a 4x4 systolic array multiplier ☆66 Updated 5 years ago
 - FPGA-based hardware accelerator for Vision Transformer (ViT), with Hybrid-Grained Pipeline. ☆98 Updated 9 months ago
 - Systolic-array based Deep Learning Accelerator generator ☆27 Updated 4 years ago
 - Fully open-source spiking neural network accelerator ☆160 Updated 2 years ago
 - ☆54 Updated 6 months ago
 - IEEE 754 floating point unit in Verilog ☆148 Updated 9 years ago
 - PolyLUT is the first quantized neural network training methodology that maps a neuron to a LUT while using multivariate polynomial functi… ☆54 Updated last year
 - INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed. ☆104 Updated 5 years ago
 - A SystemVerilog implementation of Row-Stationary dataflow and Hierarchical Mesh Network-on-Chip Architecture based on Eyeriss CNN Acceler… ☆174 Updated 5 years ago
 - Hardware design of a universal NPU (CNN accelerator) for various convolutional neural networks ☆154 Updated 7 months ago
 - 16-bit Adder Multiplier hardware on Digilent Basys 3 ☆82 Updated 2 years ago
 - NeuraLUT-Assemble ☆43 Updated 2 months ago
 - An AXI4 crossbar implementation in SystemVerilog ☆178 Updated 2 months ago
 - Vector processor for RISC-V vector ISA ☆129 Updated 5 years ago
 - FPGA/AES/LeNet/VGG16 ☆108 Updated 7 years ago
 - An Open Workflow to Build Custom SoCs and run Deep Models at the Edge ☆96 Updated this week
 - A compiler from AI model to RTL (Verilog) accelerator in FPGA hardware with auto design space exploration. ☆435 Updated 5 years ago