Efficient-ML / Awesome-Model-Quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. Pull requests adding works (papers, repositories) that the repo has missed are welcome.
☆2,306 · Updated 10 months ago
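For readers new to the area, the toy sketch below shows the core operation the listed resources revolve around: mapping float tensors to low-bit integers with a scale factor and mapping them back with some loss of precision. It is a framework-free, per-tensor symmetric int8 example written for illustration only, not the method of any particular repository.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Uniform symmetric int8 quantization with a single per-tensor scale."""
    max_abs = float(np.abs(x).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to float32 (lossy)."""
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int8(w)
    # Reconstruction error per element is bounded by roughly scale / 2.
    print("max abs error:", np.abs(w - dequantize(q, s)).max())
```

Real toolkits (AIMET, PPQ, Brevitas, and the other flows listed below) add per-channel scales, zero-points for asymmetric ranges, calibration, and quantization-aware training on top of this basic mapping.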
Alternatives and similar repositories for Awesome-Model-Quantization
Users interested in Awesome-Model-Quantization are comparing it to the libraries listed below.
- List of papers related to neural network quantization in recent AI conferences and journals. ☆777 · Updated 9 months ago
- A curated list of neural network pruning resources. ☆2,484 · Updated last year
- Awesome machine learning model compression research papers, quantization, tools, and learning material. ☆540 · Updated last year
- Summary, Code for Deep Neural Network Quantization ☆559 · Updated 7 months ago
- Collection of recent methods on (deep) neural network compression and acceleration. ☆955 · Updated 9 months ago
- Model Quantization Benchmark ☆855 · Updated 8 months ago
- AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. ☆2,537 · Updated this week
- Awesome LLM compression research papers and tools. ☆1,757 · Updated 2 months ago
- PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool. ☆1,775 · Updated last year
- A curated list for Efficient Large Language Models ☆1,929 · Updated 6 months ago
- A simple network quantization demo using pytorch from scratch. ☆541 · Updated 2 years ago
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,247 · Updated 6 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (a generic PTQ sketch follows this list) ☆1,585 · Updated last year
- [ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment ☆1,938 · Updated 2 years ago
- Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM. ☆452 · Updated 2 years ago
- ☆290 · Updated last year
- TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework. ☆862 · Updated 3 weeks ago
- [CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc. ☆3,240 · Updated 4 months ago
- micronet, a model compression and deploy lib. compression: 1. quantization: quantization-aware-training (QAT), High-Bit (>2b) (DoReFa/Quantiz… ☆2,272 · Updated 8 months ago
- Awesome Knowledge-Distillation. Knowledge distillation papers (2014-2021), organized by category. ☆2,647 · Updated 2 years ago
- Pytorch implementation of various Knowledge Distillation (KD) methods. ☆1,738 · Updated 4 years ago
- Awesome Pruning. ✅ Curated Resources for Neural Network Pruning. ☆172 · Updated last year
- knowledge distillation papers ☆767 · Updated 2 years ago
- [IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer ☆355 · Updated 2 years ago
- A coding-free framework built on PyTorch for reproducible deep learning studies. PyTorch Ecosystem. 🏆26 knowledge distillation methods p… ☆1,584 · Updated 3 weeks ago
- Brevitas: neural network quantization in PyTorch ☆1,474 · Updated this week
- ☆670 · Updated 4 years ago
- SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, … ☆2,565 · Updated this week
- Unofficial implementation of LSQ-Net, a neural network quantization framework ☆309 · Updated last year
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag… ☆203 · Updated 11 months ago
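As referenced at the SmoothQuant entry above, several of the listed projects target post-training quantization (PTQ), while others cover quantization-aware training (QAT). The sketch below is a generic, minimal PTQ example using PyTorch's built-in dynamic quantization API; it is not the workflow of any listed library, just a rough illustration of what quantizing a trained model can look like in a few lines.

```python
import torch
import torch.nn as nn

# A toy trained model stands in here; any module with nn.Linear layers works.
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).eval()

# Dynamic post-training quantization: Linear weights are converted to int8
# ahead of time; activations are quantized on the fly at inference (CPU only).
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(qmodel(x).shape)  # torch.Size([1, 10])
```

Static PTQ and QAT, which the repositories above cover in depth, additionally calibrate or learn activation ranges and insert fake-quantization nodes during training.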