mit-han-lab / Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
242 · Updated 2 months ago

Alternatives and similar repositories for Quest:

Users who are interested in Quest are comparing it to the libraries listed below.