rayleizhu / vllm-ra
[ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts
☆40 Updated last year
Alternatives and similar repositories for vllm-ra
Users who are interested in vllm-ra are comparing it to the libraries listed below.
- Odysseus: Playground of LLM Sequence Parallelism ☆70 Updated last year
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆42 Updated last week
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry