Zhen-Dong / CoDeNet
[FPGA'21] CoDeNet is an efficient object detection model in PyTorch, achieving SOTA performance on VOC and COCO; it builds on CenterNet with co-designed deformable convolution.
☆25, updated last year
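The co-designed deformable convolution mentioned above augments the fixed 3×3 sampling grid with a learned per-tap offset, sampling the input bilinearly at the shifted positions. A minimal NumPy sketch of that sampling pattern (illustrative only; the function names are hypothetical and this is not CoDeNet's implementation):

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Bilinearly sample a 2-D image at fractional coords (y, x); zero outside."""
    H, W = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                # Weight each corner by its proximity to (y, x).
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * img[yy, xx]
    return val

def deform_conv_at(img, weight, offsets, cy, cx):
    """One output of a 3x3 deformable conv (cross-correlation form) at (cy, cx).

    offsets: (9, 2) array of learned (dy, dx) shifts, one per kernel tap.
    """
    out, k = 0.0, 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            # Sample at the regular grid position plus the learned offset.
            out += weight[ky + 1, kx + 1] * bilinear_sample(
                img, cy + ky + dy, cx + kx + dx)
            k += 1
    return out
```

With all offsets zero this reduces to an ordinary 3×3 convolution; non-zero offsets let each tap reach off-grid positions, which is what makes the operation irregular and worth co-designing hardware for.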
Related projects:
- An out-of-the-box PyTorch scaffold for neural network quantization-aware training (QAT) research. Website: https://github.com/zhutmost/neuralz… (☆26, updated last year)
- [ICML 2021] "Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators" by Yonggan Fu, Yonga… (☆14, updated 2 years ago)
- My name is Fang Biao. I'm currently pursuing my Master's degree at the College of Computer Science and Engineering, Sichuan University, … (☆41, updated last year)
- [HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning (☆64, updated 3 weeks ago)
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design (☆88, updated last year)
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture… (☆22, updated last year)
- [ICLR 2021] BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (☆36, updated 3 years ago)
- DAC System Design Contest 2020 (☆29, updated 4 years ago)
- [MLSys'20] Official implementation of "Searching for Winograd-aware Quantized Networks" (☆26, updated 11 months ago)
- A Linux Docker image for the DNN accelerator exploration infrastructure composed of Accelergy and Timeloop (☆41, updated 3 months ago)
- DeiT implementation for Q-ViT (☆23, updated 2 years ago)
- Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts (☆82, updated 4 months ago)
- Post-training sparsity-aware quantization (☆32, updated last year)
- An FPGA-based neural network inference accelerator, which won third place in DAC-SDC (☆28, updated 2 years ago)
- Algorithm-hardware Co-design for Deformable Convolution (☆24, updated 3 years ago)
- HW/SW co-design of sentence-level energy optimizations for latency-aware multi-task NLP inference (☆46, updated 5 months ago)
- The official PyTorch implementation of the NeurIPS 2022 (spotlight) paper "Outlier Suppression: Pushing the Limit of Low-bit Transformer L…" (☆46, updated last year)
- An FPGA Accelerator for Transformer Inference (☆69, updated 2 years ago)
- Neural Network Quantization With Fractional Bit-widths (☆12, updated 3 years ago)
- A collection of research papers on efficient training of DNNs (☆67, updated 2 years ago)
- [TCAD 2021] Block Convolution: Towards Memory-Efficient Inference of Large-Scale CNNs on FPGA (☆15, updated 2 years ago)
- Approximate layers (TensorFlow extension) (☆25, updated 4 months ago)
- BitPack is a practical tool to efficiently save ultra-low-precision/mixed-precision quantized models (☆49, updated last year)
- [DATE 2021] Code for the paper "FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays" (☆11, updated 3 years ago)