xuyuzhuang11 / OneBit
The homepage of the OneBit model quantization framework.
☆169 · Updated 2 weeks ago
Alternatives and similar repositories for OneBit:
Users who are interested in OneBit are comparing it to the libraries listed below.
- EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆246 · Updated 4 months ago
- [ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs ☆206 · Updated last month
- [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization ☆332 · Updated 6 months ago
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆145 · Updated 8 months ago
- The official implementation of the EMNLP 2023 paper LLM-FP4 ☆184 · Updated last year
- PB-LLM: Partially Binarized Large Language Models