HuangCongQing / model-compression-optimization
Model compression and optimization for deployment with PyTorch, including knowledge distillation, quantization, and pruning.
☆20Updated last year
Alternatives and similar repositories for model-compression-optimization
Users who are interested in model-compression-optimization are comparing it to the libraries listed below.
- [NeurIPS 2023] MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory☆75Updated 2 years ago
- Provides some new architectures, channel pruning, and quantization methods for YOLOv5☆30Updated 3 months ago
- EQ-Net [ICCV 2023]☆30Updated 2 years ago
- RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration☆27Updated 7 months ago
- The official (TMLR) implementation of LumiNet: Perception-Driven Knowledge Distillation via Statistical Logit Calibration☆17Updated 5 months ago
- ☆49Updated 3 years ago
- ☆36Updated 2 years ago
- Quantizes PyTorch models; supports post-training quantization and quantization-aware training methods☆14Updated 2 years ago
- To appear in the 11th International Conference on Learning Representations (ICLR 2023).☆18Updated 2 years ago
- [CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything☆81Updated last year
- The official PyTorch implementation of CHEX: CHannel EXploration for CNN Model Compression (CVPR 2022). Paper is available at https://ope…☆38Updated 3 years ago
- The official implementation of paper PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search☆31Updated 2 years ago
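- DETR tensor: removes the auxiliary heads that are unused during inference, adds a further speedup from FP16 deployment, and gives a new method for fixing all-zero outputs after conversion to TensorRT.☆12Updated 2 years ago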
- [ICCV-2023] EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization☆28Updated 2 years ago
- [ICCV 2025] Task-Specific Zero-shot Quantization-Aware Training for Object Detection☆26Updated 4 months ago
- YOLOv5 Quantization Aware Training (QAT, qat_torch branch) and Post Training Quantization with ONNX (ptq_onnx branch ptq_onnx.ipynb)☆15Updated 2 years ago
- [CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms☆30Updated 3 years ago
- Base quantization methods, including QAT, PTQ, per_channel, per_tensor, dorefa, lsq, adaround, omse, Histogram, bias_correction, etc.☆51Updated 3 years ago
- Model Compression: 1. Pruning (BN Pruning) 2. Knowledge Distillation (Hinton) 3. Quantization (MNN) 4. Deployment (MNN)☆80Updated 5 years ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar…☆56Updated last year
- Jupyter notebook tutorials for MMDeploy☆38Updated 3 years ago
- TensorRT 2022 runner-up solution: accelerating the MobileViT model with TensorRT☆68Updated 3 years ago
- This repository is a summary of ONNX tutorials with Python implementations, collected from other web resources.☆29Updated 3 years ago
- DeiT implementation for Q-ViT☆25Updated 9 months ago
- ☆13Updated last year
- ☆47Updated 2 years ago
- For 2022 Nvidia Hackathon☆22Updated 3 years ago
- YOLO Series☆14Updated 2 years ago
- An ONNX-based quantization tool.☆71Updated 2 years ago
- [ICLR 2022] The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training by Shiwei Liu, Tianlo…☆77Updated 3 years ago