Efficient-ML / Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to collect resources for model quantization research, and we are continuously improving the project. Pull requests for works (papers, repositories) missing from the repo are welcome.
☆1,886 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for Awesome-Model-Quantization
- A curated list of neural network pruning resources. ☆2,361 · Updated 7 months ago
- List of papers related to neural network quantization in recent AI conferences and journals. ☆458 · Updated last month
- Model Quantization Benchmark ☆765 · Updated 5 months ago
- Summary, Code for Deep Neural Network Quantization ☆531 · Updated last month
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆491 · Updated last month
- Collection of recent methods on (deep) neural network compression and acceleration. ☆930 · Updated 2 months ago
- A simple network quantization demo using PyTorch from scratch (a minimal sketch of the core idea follows this list). ☆511 · Updated last year
- micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa/Quantiz… ☆2,219 · Updated 3 years ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,558 · Updated 7 months ago
- [CVPR 2023] DepGraph: Towards Any Structural Pruning ☆2,724 · Updated this week
- Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆413 · Updated last year
- A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 25 knowledge distillation methods presented at CVPR, I… ☆1,392 · Updated last month
- PyTorch implementation of various knowledge distillation (KD) methods. ☆1,614 · Updated 2 years ago
- Awesome Knowledge-Distillation. Knowledge distillation papers (2014-2021), organized by category. ☆2,497 · Updated last year
- Rethinking the Value of Network Pruning (PyTorch) (ICLR 2019) ☆1,510 · Updated 4 years ago
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,148 · Updated this week
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,884 · Updated 11 months ago
- [ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices ☆430 · Updated 11 months ago
- OpenMMLab Model Compression Toolbox and Benchmark. ☆1,479 · Updated 5 months ago
- PyTorch implementation of the APoT quantization (ICLR 2020) ☆268 · Updated 2 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (see the scaling sketch after this list). ☆1,257 · Updated 4 months ago
- Awesome LLM compression research papers and tools. ☆1,202 · Updated this week
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ☆308 · Updated last year
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,025 · Updated last week
- [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision ☆370 · Updated 3 years ago
- Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distille… ☆4,351 · Updated last year
- Knowledge distillation papers ☆741 · Updated last year
- A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices. ☆336 · Updated 3 months ago
- A general and accurate MACs / FLOPs profiler for PyTorch models (a hook-based sketch of the general idea follows below). ☆571 · Updated 6 months ago
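
For orientation, the from-scratch quantization demo above reduces to uniform affine (asymmetric) quantization. The following is a minimal, self-contained PyTorch sketch of that idea only; it is not taken from any repository listed here, and function names such as `affine_quantize` are made up for illustration.

```python
import torch

def affine_quantize(x: torch.Tensor, num_bits: int = 8):
    """Per-tensor affine quantization: x ≈ scale * (q - zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min().item(), x.max().item()
    # Guard against a degenerate range so the scale stays non-zero.
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax).to(torch.uint8)
    return q, scale, zero_point

def affine_dequantize(q: torch.Tensor, scale: float, zero_point: int) -> torch.Tensor:
    return scale * (q.to(torch.float32) - zero_point)

if __name__ == "__main__":
    w = torch.randn(4, 4)
    q, s, zp = affine_quantize(w, num_bits=8)
    w_hat = affine_dequantize(q, s, zp)
    print("max abs error:", (w - w_hat).abs().max().item())
```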
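SmoothQuant (linked above) migrates activation outliers into the weights with a per-channel scale before quantizing both. The snippet below sketches only that scaling identity, assuming a standard `nn.Linear`-style `X @ W.T` layout; it is not the project's actual API, and `smooth_scales` is a hypothetical helper.

```python
import torch

def smooth_scales(act_absmax: torch.Tensor, weight: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Per-input-channel smoothing factors s_j = max|X_j|^alpha / max|W_j|^(1-alpha)."""
    w_absmax = weight.abs().amax(dim=0).clamp(min=1e-5)  # weight shape: (out_features, in_features)
    return (act_absmax.pow(alpha) / w_absmax.pow(1.0 - alpha)).clamp(min=1e-5)

# X @ W.T == (X / s) @ (W * s).T, so the scaling is mathematically lossless;
# it only reshapes the ranges so both factors quantize more easily.
X = torch.randn(8, 16) * torch.rand(16) * 10   # activations with per-channel outliers
W = torch.randn(32, 16)                        # linear layer weight
s = smooth_scales(X.abs().amax(dim=0), W)
out_ref = X @ W.T
out_smoothed = (X / s) @ (W * s).T
print("max diff:", (out_ref - out_smoothed).abs().max().item())
```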
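As a rough illustration of how hook-based MACs/FLOPs profilers generally work (not the specific profiler linked above), the sketch below registers forward hooks on `nn.Conv2d` and `nn.Linear` and tallies multiply-accumulates; real profilers handle many more layer types and edge cases.

```python
import torch
import torch.nn as nn

def count_macs(model: nn.Module, input_shape=(1, 3, 224, 224)) -> int:
    """Estimate multiply-accumulate ops for Conv2d/Linear layers via forward hooks."""
    macs = 0

    def conv_hook(module, inputs, output):
        nonlocal macs
        out_elems = output.numel()  # N * C_out * H_out * W_out
        kernel_ops = module.in_channels // module.groups * module.kernel_size[0] * module.kernel_size[1]
        macs += out_elems * kernel_ops

    def linear_hook(module, inputs, output):
        nonlocal macs
        macs += output.numel() * module.in_features

    handles = []
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            handles.append(m.register_forward_hook(conv_hook))
        elif isinstance(m, nn.Linear):
            handles.append(m.register_forward_hook(linear_hook))

    with torch.no_grad():
        model(torch.zeros(*input_shape))
    for h in handles:
        h.remove()
    return macs

if __name__ == "__main__":
    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Flatten(), nn.Linear(16 * 224 * 224, 10))
    print(f"~{count_macs(model) / 1e6:.1f} MMACs")
```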