ChenMnZ / PrefixQuant

An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
136 · Updated last week

Alternatives and similar repositories for PrefixQuant

Users who are interested in PrefixQuant are comparing it to the libraries listed below.
