tloen / llama-int8

Quantized inference code for LLaMA models
1,051 · Updated last year

Related projects

Alternatives and complementary repositories for llama-int8