Aaronhuang-778 / SliM-LLM
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
☆28 · Updated 7 months ago
Alternatives and similar repositories for SliM-LLM:
Users interested in SliM-LLM are comparing it to the repositories listed below:
- Activation-aware Singular Value Decomposition for Compressing Large Language Models — ☆59 · Updated 5 months ago
- An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization — ☆121 · Updated last month
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs — ☆99 · Updated 3 months ago
- ☆39 · Updated 8 months ago
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization — ☆33 · Updated 6 months ago
- AFPQ code implementation