machilusZ / FastGen

This repo contains the source code for the paper "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs".
32 · Updated 6 months ago

Alternatives and similar repositories for FastGen:

Users who are interested in FastGen are comparing it to the libraries listed below.