nqHITSZ / Systolic-Array
course design
☆23Updated 7 years ago
Alternatives and similar repositories for Systolic-Array
Users who are interested in Systolic-Array are comparing it to the repositories listed below:
- SAURIA (Systolic-Array tensor Unit for aRtificial Intelligence Acceleration) is an open-source Convolutional Neural Network accelerator b…☆71Updated 2 weeks ago
- Template for project1 TPU☆20Updated 4 years ago
- ☆28Updated 6 years ago
- ☆71Updated 7 years ago
- tpu-systolic-array-weight-stationary☆25Updated 4 years ago
- TensorCore Vector Processor for Deep Learning - Google Summer of Code Project☆24Updated 4 years ago
- ☆38Updated 6 years ago
- ☆36Updated 4 years ago
- A Verilog implementation for Network-on-Chip☆78Updated 7 years ago
- ☆38Updated 8 months ago
- LCAI-TIHU HW is an AI inference processor which is comprised of RISC-V cpu, nvdla, NoC bus, PCIe module, DDR, SRAM, bootROM, DMA and peri…☆42Updated 2 years ago
- ☆65Updated 7 months ago
- eyeriss-chisel3☆40Updated 3 years ago
- Vector multiply-accumulate accelerator (using Chisel 3 and RocketChip RoCC)☆53Updated 5 years ago
- Tutorials on HLS Design☆52Updated 5 years ago
- A toolchain for rapid design space exploration of chiplet architectures☆68Updated 4 months ago
- HLS for Networks-on-Chip☆38Updated 4 years ago
- Public release☆58Updated 6 years ago
- Prototype-network-on-chip (ProNoC) is an EDA tool that facilitates prototyping of custom heterogeneous NoC-based many-core-SoC (MCSoC).☆59Updated 2 weeks ago
- NoC (Network-on-Chip) generator that generates Verilog HDL model of NoC consisting of on-chip routers☆71Updated 5 years ago
- Dadda multiplier (8×8, 16×16, 32×32) in Verilog HDL.☆36Updated last year
- This work implements a dynamic programming algorithm for performing local sequence alignment. Through parallelism, it can run 136X times …☆27Updated 6 years ago
- A Reconfigurable Accelerator for Deep Convolutional Neural Networks Implemented by Chisel3.☆29Updated 4 years ago
- Systolic array implementations for Cholesky, LU, and QR decomposition☆47Updated last year
- Verilog Implementation of 32-bit Floating Point Adder☆44Updated 5 years ago
- The official NaplesPU hardware code repository☆20Updated 6 years ago
- 32-bit floating-point Multiplier Accumulator Unit (MAC)☆33Updated 4 years ago
- ☆31Updated 5 years ago
- Provides the hardware code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerator…☆24Updated 5 years ago
- Transactional Verilog design and Verilator Testbench for a RISC-V TensorCore Vector co-processor for reproducible linear algebra☆60Updated 3 years ago