KastnerRG / cgra4ml
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
☆75 · Updated last month
Alternatives and similar repositories for cgra4ml:
Users who are interested in cgra4ml are comparing it to the libraries listed below.
- A Verilog implementation of a 4x4 systolic array multiplier ☆47 · Updated 4 years ago
- ☆87 · Updated 9 months ago
- ☆63 · Updated 6 years ago
- Synthesizable floating-point unit written in Verilog. Supports 32-bit (Single-Precision) Multiplication, Addition and Division and Squ… ☆50 · Updated 7 months ago
- INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed. ☆93 · Updated 4 years ago
- ☆57 · Updated 4 years ago
- PYNQ Composable Overlays ☆70 · Updated 9 months ago
- 16-bit Adder Multiplier hardware on Digilent Basys 3 ☆70 · Updated last year
- RapidStream TAPA compiles task-parallel HLS programs into high-frequency FPGA accelerators. ☆165 · Updated this week
- Verilog implementation of the Softmax function ☆59 · Updated 2 years ago
- ☆40 · Updated 3 months ago
- IEEE 754 single- and double-precision floating-point library in SystemVerilog and VHDL ☆62 · Updated 3 months ago
- Vitis HLS Library for FINN ☆191 · Updated last week
- CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture ☆131 · Updated this week
- Quantized ResNet50 Dataflow Acceleration on Alveo, with PYNQ ☆57 · Updated 3 years ago
- ☆41 · Updated 6 months ago
- ☆57 · Updated last year
- [FPGA 2022, Best Paper Award] Parallel placement and routing of Vivado HLS dataflow designs. ☆121 · Updated 2 years ago
- IC implementation of TPU ☆112 · Updated 5 years ago
- ☆87 · Updated last year
- PolyLUT is the first quantized neural network training methodology that maps a neuron to a LUT while using multivariate polynomial functi… ☆50 · Updated last year
- An AXI4 crossbar implementation in SystemVerilog ☆138 · Updated last month
- ☆31 · Updated 5 years ago
- RaveNoC is a configurable HDL NoC (Network-On-Chip) suitable for MPSoCs and different MP applications ☆163 · Updated 4 months ago
- PyTorch model to RTL flow for low latency inference ☆126 · Updated last year
- HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond ☆32 · Updated last week
- Library of approximate arithmetic circuits ☆54 · Updated 2 years ago
- Systolic array based simple TPU for CNN on PYNQ-Z2 ☆28 · Updated 2 years ago
- 32-bit floating-point operation algorithms implemented in Verilog using logic operations. ☆82 · Updated 5 years ago
- BARVINN: A Barrel RISC-V Neural Network Accelerator: https://barvinn.readthedocs.io/en/latest/ ☆85 · Updated 2 months ago