machilusZ / FastGen

This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
39 · Updated last year

Alternatives and similar repositories for FastGen

Users who are interested in FastGen are comparing it to the libraries listed below.
