AniZpZ / smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
☆11 Updated 2 years ago
Alternatives and similar repositories for smoothquant
Users who are interested in smoothquant are comparing it to the libraries listed below
- ☆122 Updated last year
- ☆206 Updated 7 months ago
- An easy-to-use package for implementing SmoothQuant for LLMs ☆110 Updated 8 months ago
- Easy and Efficient Quantization for Transformers ☆202 Updated 6 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆23 Updated last year
- Boosting 4-bit inference kernels with 2:4 Sparsity ☆89 Updated last year
- Benchmark suite for LLMs from Fireworks.ai ☆84 Updated last month
- ☆97 Updated 8 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆267 Updated 3 weeks ago
- A general 2-8 bits quantization toolbox with GPTQ/AWQ/HQQ/VPTQ, and export to onnx/onnx-runtime easily.