zslwyuan / Zynq_HLS_DDR_Dataflow_kernel_2mm
This project integrates an HLS IP core with the Cortex-A9 on Zynq. It is a CPU-FPGA design for matrix multiplication, implemented in HLS with dataflow and DDR3 access. The Cortex-A9 prints the result via UART and checks it by comparing the data with the result of the CPU computation.
☆20Updated 5 years ago
Alternatives and similar repositories for Zynq_HLS_DDR_Dataflow_kernel_2mm:
Users that are interested in Zynq_HLS_DDR_Dataflow_kernel_2mm are comparing it to the libraries listed below
- Systolic matrix multiplication kernel implemented on Xilinx PYNQ FPGA board☆12Updated 4 years ago
- The Verilog source code for DRUM approximate multiplier.☆29Updated last year
- ☆29Updated 5 years ago
- A 16-point radix-4 FFT chip, including Verilog codes, netlists and layout. Group project.☆59Updated 6 months ago
- 3×3 systolic array multiplier☆42Updated 5 years ago
- INT8 & FP16 multiplier accumulator (MAC) design with UVM verification completed.☆89Updated 4 years ago
- ☆62Updated 6 years ago
- Contains FPGA benchmarks for Vivado HLS and Catapult HLS☆26Updated 4 years ago
- CNN accelerator based on FPGA, developed in Verilog HDL.☆45Updated 4 years ago
- A project on hardware design for convolutional neural network. This neural network is of 2 layers with 400 inputs in the first layer. Thi…☆18Updated 7 years ago
- SystemVerilog files for lab project on a DNN hardware accelerator☆16Updated 3 years ago
- LCAI-TIHU HW is an AI inference processor which is comprised of RISC-V cpu, nvdla, NoC bus, PCIe module, DDR, SRAM, bootROM, DMA and peri…☆36Updated 2 years ago
- Hardware accelerator for convolutional neural networks☆36Updated 2 years ago
- This repository contains all the parameters you need to synthesize the AlexNet by using Vivado High Level Synthesis.☆21Updated 7 years ago
- Designing CNN accelerator using a Xilinx FPGA board and comparing performance with CPU.☆21Updated 4 years ago
- tpu-systolic-array-weight-stationary☆20Updated 3 years ago
- 32-bit floating point Multiplier Accumulator Unit (MAC)☆27Updated 4 years ago
- A Verilog implementation for Network-on-Chip☆73Updated 7 years ago
- This is a simple project that shows how to multiply two 3x3 matrices in Verilog.☆50Updated 7 years ago
- Convolutional Neural Network Using High Level Synthesis☆85Updated 4 years ago
- Low level design of a chip built for optimizing/accelerating CNN classifiers over gray scale images.☆12Updated 5 years ago
- CS533 Course Project (ongoing) - Exploring Parallel Architectures for Neural Processing Unit Implementations☆19Updated 7 years ago
- eyeriss-chisel3☆40Updated 2 years ago
- Systolic array based simple TPU for CNN on PYNQ-Z2☆24Updated 2 years ago
- Verilog Implementation of 32-bit Floating Point Adder☆36Updated 4 years ago
- SoC Based on ARM Cortex-M3☆27Updated last month
- A generic implementation of AMBA AXI4 communication protocol. The design provides a master, a slave and an interconnect with multiple mas…☆31Updated 2 years ago
- Convolutional Neural Network Implemented in Verilog for System on Chip☆26Updated 5 years ago
- IPs for data-plane integration of Hardware Processing Engines (HWPEs) within a PULP system☆19Updated last week
- AXI master to AHB slave; supports INCR/WRAP and outstanding transactions, but does not support advanced features such as out-of-order, retry, split, etc.☆36Updated 2 years ago