KastnerRG / cgra4ml
An Open Workflow to Build Custom SoCs and run Deep Models at the Edge
☆64 Updated 3 months ago
Related projects
Alternatives and complementary repositories for cgra4ml
- Verilog implementation of Softmax function ☆47 Updated 2 years ago
- IC implementation of Systolic Array for TPU ☆148 Updated 2 weeks ago
- CHARM: Composing Heterogeneous Accelerators on Versal ACAP Architecture ☆123 Updated this week
- IC implementation of TPU ☆86 Updated 4 years ago
- ☆60 Updated 5 years ago
- ☆82 Updated 4 months ago
- 16-bit Adder Multiplier hardware on Digilent Basys 3 ☆63 Updated last year
- Vitis HLS Library for FINN ☆178 Updated 2 weeks ago
- A SystemVerilog implementation of Row-Stationary dataflow and Hierarchical Mesh Network-on-Chip Architecture based on Eyeriss CNN Acceler… ☆128 Updated 4 years ago
- INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed. ☆81 Updated 4 years ago
- ☆93 Updated 4 years ago
- PolyLUT is the first quantized neural network training methodology that maps a neuron to a LUT while using multivariate polynomial functi… ☆39 Updated 9 months ago
- An FPGA accelerator for general-purpose Sparse-Matrix Dense-Matrix Multiplication (SpMM). ☆66 Updated 3 months ago
- IEEE 754 single and double precision floating point library in SystemVerilog and VHDL ☆57 Updated 3 weeks ago
- PyTorch model to RTL flow for low latency inference ☆121 Updated 7 months ago
- eyeriss-chisel3 ☆38 Updated 2 years ago
- HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators ☆113 Updated this week
- PYNQ Composable Overlays ☆67 Updated 4 months ago
- FREE TPU V3plus for FPGA is the free version of a commercial AI processor (EEP-TPU) for Deep Learning EDGE Inference ☆108 Updated last year
- Hardware accelerator for convolutional neural networks ☆26 Updated 2 years ago
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions ☆174 Updated 4 years ago
- The codes and artifacts associated with our MICRO'22 paper titled: "Adaptable Butterfly Accelerator for Attention-based NNs via Hardware … ☆110 Updated last year
- ☆26 Updated 5 years ago
- ☆69 Updated last year
- tpu-systolic-array-weight-stationary ☆18 Updated 3 years ago
- RapidStream TAPA compiles task-parallel HLS program into high-frequency FPGA accelerators. ☆155 Updated this week
- Library of approximate arithmetic circuits ☆49 Updated 2 years ago
- Convolutional accelerator kernel, target ASIC & FPGA ☆162 Updated last year
- ☆55 Updated 4 years ago
- A collection of tutorials for the fpgaConvNet framework. ☆30 Updated last month