[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆80 · Nov 4, 2024 · Updated last year
Alternatives and similar repositories for vllm-ltr
Users that are interested in vllm-ltr are comparing it to the libraries listed below.
- ☆90 · Oct 17, 2025 · Updated 8 months ago
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an … ☆51 · Jun 1, 2024 · Updated 2 years ago
- ☆21 · Jun 9, 2025 · Updated last year
- ☆134 · Nov 11, 2024 · Updated last year
- An auxiliary project analysis of the characteristics of KV in DiT Attention. ☆34 · Nov 29, 2024 · Updated last year
- Efficient Long-context Language Model Training by Core Attention Disaggregation