mit-han-lab / duo-attention
[ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
532 stars · Updated Feb 10, 2025

Alternatives and similar repositories for duo-attention

Users interested in duo-attention are comparing it to the libraries listed below.
