tloen / llama-int8

Quantized inference code for LLaMA models
1,051 · Updated last year

Alternatives and similar repositories for llama-int8:

Users who are interested in llama-int8 are comparing it to the libraries listed below.