microsoft / LLMLingua

[EMNLP'23, ACL'24] To speed up LLM inference and enhance the LLM's perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
5,660 stars · Updated last month
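The description above refers to perplexity-style prompt compression: a small scoring model rates how informative each token is, and low-scoring tokens are dropped until the prompt meets a target compression rate. The following is a toy sketch of that idea only, not LLMLingua's actual implementation; the `compress_prompt` function, its token/score inputs, and the hand-picked importance values are all illustrative assumptions.

```python
# Toy sketch of perplexity-style prompt compression (NOT LLMLingua's real
# implementation): drop the tokens a scoring model deems least informative
# until the prompt fits a target compression rate.

def compress_prompt(tokens, scores, rate):
    """Keep the highest-scoring fraction `rate` of tokens, preserving order.

    tokens: list of token strings
    scores: per-token importance (higher = keep); stands in for the
            surprisal a small LM would assign to each token
    rate:   fraction of tokens to keep, e.g. 0.5 for 2x compression
    """
    keep = max(1, int(len(tokens) * rate))
    # indices of the `keep` most important tokens
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    top_set = set(top)
    # filter in original order so the compressed prompt stays readable
    return [t for i, t in enumerate(tokens) if i in top_set]

prompt = ["Please", "summarize", "the", "quarterly", "report", "for", "me"]
importance = [0.1, 0.9, 0.2, 0.8, 0.85, 0.15, 0.1]  # made-up scores
print(compress_prompt(prompt, importance, 0.5))
```

In the real library, the scores come from a small causal language model rather than a hand-written list, and compression is applied at coarse (sentence/demonstration) and fine (token) granularity.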

Alternatives and similar repositories for LLMLingua

Users interested in LLMLingua are comparing it to the libraries listed below.
