[TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA
☆17Jul 7, 2022Updated 3 years ago
Alternatives and similar repositories for BlockConv
Users that are interested in BlockConv are comparing it to the libraries listed below
Sorting:
- HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference☆54Mar 24, 2024Updated last year
- Post-training sparsity-aware quantization☆34Feb 26, 2023Updated 3 years ago
- ☆35Jul 9, 2020Updated 5 years ago
- ☆21Oct 26, 2022Updated 3 years ago
- TQT's pytorch implementation.☆21Dec 17, 2021Updated 4 years ago
- Training Quantized Neural Networks with a Full-precision Auxiliary Module☆13Jun 19, 2020Updated 5 years ago
- The official code for [ECCV2020] "HALO: Hardware-aware Learning to Optimize"☆10Mar 22, 2023Updated 2 years ago
- CNN-Accelerator based on FPGA developed by verilog HDL.☆11Jan 27, 2022Updated 4 years ago
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture…☆25Oct 1, 2022Updated 3 years ago
- A FPGA-based neural network inference accelerator, which won the third place in DAC-SDC☆28May 11, 2022Updated 3 years ago
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions.☆13Apr 6, 2021Updated 4 years ago
- This repository is an official PyTorch implementation of our paper "Feature Distillation Interaction Weighting Network for Lightweight Im…☆13May 6, 2023Updated 2 years ago
- Neural Network Quantization With Fractional Bit-widths☆11Feb 19, 2021Updated 5 years ago
- An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications.☆14Feb 3, 2025Updated last year
- BitSplit Post-trining Quantization☆50Dec 20, 2021Updated 4 years ago
- Designs for finalist teams of the DAC System Design Contest☆37Jul 8, 2020Updated 5 years ago
- CNN Accelerator in Frequency Domain☆12Feb 22, 2020Updated 6 years ago
- Arrhythmia Detection Using Algorithm and Hardware Co-design for Neural Network Inference Accelerators☆16Jun 5, 2023Updated 2 years ago
- [DATE 2025] Official implementation and dataset of AIrchitect v2: Learning the Hardware Accelerator Design Space through Unified Represen…☆19Jan 17, 2025Updated last year
- [ECCV2022] Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution☆26Jan 12, 2023Updated 3 years ago
- ☆35Mar 1, 2019Updated 6 years ago
- FRAME: Fast Roofline Analytical Modeling and Estimation☆39Oct 13, 2023Updated 2 years ago
- This is a repository of Binary General Matrix Multiply (BGEMM) by customized CUDA kernel. Thank FP6-LLM for the wheels!☆18Aug 30, 2024Updated last year
- Deep Learning Accelerator Based on Eyeriss V2 Architecture with custom RISC-V extended instructions☆207Jun 25, 2020Updated 5 years ago
- Implementation and optimization of matrix multiplication on single CPU (HPC-THU-2023-Autumn)☆18Feb 27, 2024Updated 2 years ago
- AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline☆13Nov 29, 2022Updated 3 years ago
- Adaptive floating-point based numerical format for resilient deep learning☆14Apr 11, 2022Updated 3 years ago
- [TECS'23] A project on the co-design of Accelerators and CNNs.☆21Dec 10, 2022Updated 3 years ago
- Deployment of Deep learning Image Super-Resolution Models in Xilinx Zynq MPSoC ZCU102☆19May 30, 2020Updated 5 years ago
- Open-source of MSD framework☆16Sep 12, 2023Updated 2 years ago
- MaxEVA: Maximizing the Efficiency of Matrix Multiplication on Versal AI Engine (accepted as full paper at FPT'23)☆21Apr 17, 2024Updated last year
- This is the entry project of the Xilinx Adaptive Computing Challenge 2021. It uses YOLOv3 for ship target detection in optical remote sen…☆16May 1, 2022Updated 3 years ago
- This repository provides an FPGA-based solution for executing object detection, focusing specifically on the popular YOLOv5 model archite…☆51Jan 12, 2026Updated last month
- FlexASR: A Reconfigurable Hardware Accelerator for Attention-based Seq-to-Seq Networks☆51Feb 26, 2025Updated last year
- [CVPR 2022] Learnable Lookup Table for Neural Network Quantization☆43Oct 6, 2022Updated 3 years ago
- Provides the code for the paper "EBPC: Extended Bit-Plane Compression for Deep Neural Network Inference and Training Accelerators" by Luk…☆19Oct 6, 2019Updated 6 years ago
- Benchmarking PyTorch 2.0 different models☆20Mar 19, 2023Updated 2 years ago
- BUPT web search home work。基于SML算法的GMM模型。☆17Nov 20, 2019Updated 6 years ago
- MICRO22 artifact evaluation for Sparseloop☆47Aug 8, 2022Updated 3 years ago