Efficient-ML / Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.
☆2,063 Updated last month
Alternatives and similar repositories for Awesome-Model-Quantization:
Users who are interested in Awesome-Model-Quantization are comparing it to the repositories listed below.
- List of papers related to neural network quantization in recent AI conferences and journals. ☆597 Updated 3 weeks ago
- A curated list of neural network pruning resources. ☆2,437 Updated last year
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆510 Updated 7 months ago
- Collection of recent methods on (deep) neural network compression and acceleration. ☆945 Updated 3 weeks ago
- Model Quantization Benchmark ☆799 Updated this week
- [CVPR 2023] DepGraph: Towards Any Structural Pruning ☆2,980 Updated last week
- ☆668 Updated 3 years ago
- micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa/Quantiz… ☆2,242 Updated last week
- Summary, Code for Deep Neural Network Quantization ☆548 Updated 6 months ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,679 Updated last year
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆1,392 Updated 9 months ago
- Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆431 Updated last year
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,282 Updated this week
- A simple network quantization demo using PyTorch from scratch. ☆527 Updated last year
- ☆236 Updated 8 months ago
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,140 Updated 3 weeks ago
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,908 Updated last year
- Awesome LLM compression research papers and tools. ☆1,472 Updated last week
- PyTorch implementation of various Knowledge Distillation (KD) methods. ☆1,683 Updated 3 years ago
- OpenMMLab Model Compression Toolbox and Benchmark. ☆1,580 Updated 10 months ago
- A curated list for Efficient Large Language Models ☆1,614 Updated last week
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ☆333 Updated 2 years ago
- Rethinking the Value of Network Pruning (PyTorch) (ICLR 2019) ☆1,515 Updated 4 years ago
- A general and accurate MACs / FLOPs profiler for PyTorch models ☆604 Updated 11 months ago
- [CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision ☆382 Updated 4 years ago
- PyTorch implementation for the APoT quantization (ICLR 2020) ☆271 Updated 4 months ago
- A tool to modify ONNX models in a visualization fashion, based on Netron and Flask. ☆1,473 Updated 2 months ago
- PyTorch library to facilitate development and standardized evaluation of neural network pruning methods. ☆429 Updated last year
- FLOPs counter for neural networks in the PyTorch framework ☆2,886 Updated 3 months ago
- calflops is designed to calculate FLOPs, MACs, and parameters in various neural networks, such as Linear, CNN, RNN, GCN, Transformer… ☆761 Updated 9 months ago