tloen / llama-int8

Quantized inference code for LLaMA models
1,052 · Updated last year

Alternatives and similar repositories for llama-int8:

Users who are interested in llama-int8 are comparing it to the libraries listed below.