rayleizhu / vllm-ra

[ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts
40 · Feb 29, 2024 · Updated last year

Alternatives and similar repositories for vllm-ra

Users who are interested in vllm-ra are comparing it to the libraries listed below.

