mit-han-lab / qserve

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
443 · Updated last week

Related projects

Alternatives and complementary repositories for qserve