A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) that are missing from the repo are welcome.
☆2,343 · Apr 5, 2026 · Updated this week
Alternatives and similar repositories for Awesome-Model-Quantization
Users that are interested in Awesome-Model-Quantization are comparing it to the libraries listed below.
- List of papers related to neural network quantization in recent AI conferences and journals. ☆814 · Mar 27, 2025 · Updated last year
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag… ☆205 · Feb 10, 2025 · Updated last year
- Model Quantization Benchmark ☆863 · Apr 20, 2025 · Updated 11 months ago
- Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆458 · May 15, 2023 · Updated 2 years ago
- PyTorch implementation of BRECQ, ICLR 2021 ☆296 · Aug 1, 2021 · Updated 4 years ago
- A curated list of neural network pruning resources. ☆2,491 · Apr 4, 2024 · Updated 2 years ago
- Summary, Code for Deep Neural Network Quantization ☆558 · Jun 14, 2025 · Updated 9 months ago
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ☆359 · Apr 11, 2023 · Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆1,631 · Jul 12, 2024 · Updated last year
- micronet, a model compression and deploy lib. compression: 1、quantization: quantization-aware-training(QAT), High-Bit(>2b)(DoReFa/Quantiz… ☆2,269 · May 6, 2025 · Updated 11 months ago
- Unofficial implementation of LSQ-Net, a neural network quantization framework ☆312 · May 8, 2024 · Updated last year
- [CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework ☆282 · Dec 8, 2023 · Updated 2 years ago
- PyTorch implementation for the APoT quantization (ICLR 2020) ☆287 · Dec 11, 2024 · Updated last year
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,789 · Mar 28, 2024 · Updated 2 years ago
- The official PyTorch implementation of the ICLR2022 paper, QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quan… ☆129 · Sep 23, 2025 · Updated 6 months ago
- Awesome LLM compression research papers and tools. ☆1,796 · Feb 23, 2026 · Updated last month
- Post-Training Quantization for Vision transformers. ☆242 · Jul 19, 2022 · Updated 3 years ago
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,585 · Updated this week
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆56 · Mar 4, 2024 · Updated 2 years ago
- Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". ☆2,282 · Mar 27, 2024 · Updated 2 years ago
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆543 · Sep 21, 2024 · Updated last year
- [CVPR 2020] This project is the PyTorch implementation of our accepted CVPR 2020 paper: forward and backward information retention for a… ☆180 · Mar 14, 2020 · Updated 6 years ago
- A curated list for Efficient Large Language Models ☆1,977 · Jun 17, 2025 · Updated 9 months ago
- Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022. ☆138 · Apr 28, 2022 · Updated 3 years ago
- A simple network quantization demo using PyTorch from scratch. ☆542 · Jun 18, 2023 · Updated 2 years ago
- PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction. ☆264 · Oct 3, 2023 · Updated 2 years ago
- Brevitas: neural network quantization in PyTorch ☆1,512 · Updated this week
- [CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc. ☆3,278 · Sep 7, 2025 · Updated 7 months ago
- [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Se… ☆822 · Mar 6, 2025 · Updated last year
- [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision ☆405 · Feb 26, 2021 · Updated 5 years ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric ☆61 · Mar 23, 2023 · Updated 3 years ago
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ☆3,488 · Jul 17, 2025 · Updated 8 months ago
- [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs. ☆893 · Nov 26, 2025 · Updated 4 months ago
- Reorder-based post-training quantization for large language model ☆199 · May 17, 2023 · Updated 2 years ago
- BitSplit Post-training Quantization ☆50 · Dec 20, 2021 · Updated 4 years ago
- [ICML'21 Oral] I-BERT: Integer-only BERT Quantization ☆267 · Jan 29, 2023 · Updated 3 years ago
- Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm. In ECCV 2… ☆186 · Mar 28, 2021 · Updated 5 years ago
- This project is the official implementation of our accepted ICLR 2022 paper BiBERT: Accurate Fully Binarized BERT. ☆89 · Jun 2, 2023 · Updated 2 years ago
- Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks ☆68 · Nov 4, 2021 · Updated 4 years ago