KastnerRG / cgra4ml
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
☆69Updated last month
Alternatives and similar repositories for cgra4ml:
Users who are interested in cgra4ml are comparing it to the repositories listed below
- INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed.☆85Updated 4 years ago
- A Verilog implementation of a 4x4 systolic array multiplier☆42Updated 4 years ago
- IC implementation of Systolic Array for TPU☆172Updated 2 months ago
- 16-bit Adder Multiplier hardware on Digilent Basys 3☆65Updated last year
- Verilog implementation of Softmax function☆54Updated 2 years ago
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture☆125Updated this week
- Library of approximate arithmetic circuits☆53Updated 2 years ago
- BARVINN: A Barrel RISC-V Neural Network Accelerator: https://barvinn.readthedocs.io/en/latest/ ☆81Updated last week
- ☆26Updated 5 years ago
- IEEE 754 single- and double-precision floating-point library in SystemVerilog and VHDL☆61Updated 3 weeks ago
- ☆56Updated 4 years ago
- RapidStream TAPA compiles task-parallel HLS programs into high-frequency FPGA accelerators.☆164Updated this week
- 32-bit floating-point operation algorithms implemented in Verilog using logic operations.☆77Updated 5 years ago
- Synthesizable Floating point unit written using Verilog. Supports 32-bit (Single-Precision) Multiplication, Addition and Division and Squ…☆47Updated 5 months ago
- ☆83Updated 7 months ago
- ☆60Updated 6 years ago
- Vitis HLS Library for FINN☆188Updated last month
- tpu-systolic-array-weight-stationary☆20Updated 3 years ago
- A SystemVerilog implementation of Row-Stationary dataflow and Hierarchical Mesh Network-on-Chip Architecture based on Eyeriss CNN Acceler…☆135Updated 5 years ago
- PYNQ Composable Overlays☆69Updated 7 months ago
- IC implementation of TPU☆92Updated 5 years ago
- PolyLUT is the first quantized neural network training methodology that maps a neuron to a LUT while using multivariate polynomial functi…☆45Updated 11 months ago
- A Flexible and Energy Efficient Accelerator For Sparse Convolution Neural Network☆39Updated 4 months ago
- ☆71Updated last year
- CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development.☆118Updated last month
- An AXI4 crossbar implementation in SystemVerilog☆130Updated last month
- ☆39Updated 4 months ago
- A Fast, Low-Overhead On-chip Network☆155Updated 3 weeks ago
- RTL Network-on-Chip Router Design in SystemVerilog by Andrea Galimberti, Filippo Testa and Alberto Zeni☆121Updated 6 years ago
- ☆33Updated last week