RouteWorks / RouterArena
RouterArena: An open framework for evaluating LLM routers, with standardized datasets and metrics, an automated evaluation pipeline, and a live leaderboard.
☆61 · Updated this week
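In outline, evaluating an LLM router means replaying a labeled query set through the routing policy, scoring each routed model's answer, and tracking the accuracy/cost trade-off. Below is a minimal, hypothetical sketch of that loop; every name in it (`toy_router`, `toy_model`, `evaluate`, the `COST` table) is illustrative only and not part of RouterArena's actual API.

```python
# Hypothetical sketch of an LLM-router evaluation loop.
# None of these names come from RouterArena; they only illustrate
# scoring a routing policy on accuracy vs. cost.
from dataclasses import dataclass

@dataclass
class Example:
    query: str
    answer: str

# Assumed per-model cost (e.g., $ per call) for the trade-off metric.
COST = {"small-model": 0.001, "large-model": 0.02}

def toy_router(query: str) -> str:
    """Toy policy: send long (presumably harder) queries to the large model."""
    return "large-model" if len(query) > 40 else "small-model"

def toy_model(name: str, query: str) -> str:
    """Stand-in for an actual LLM call."""
    return "42" if "answer" in query else "unknown"

def evaluate(router, dataset):
    """Return (accuracy, total cost) of a router over a labeled dataset."""
    correct, cost = 0, 0.0
    for ex in dataset:
        model = router(ex.query)
        pred = toy_model(model, ex.query)
        correct += int(pred == ex.answer)
        cost += COST[model]
    return correct / len(dataset), cost

if __name__ == "__main__":
    data = [Example("What is the answer to everything?", "42"),
            Example("hi", "hello")]
    acc, total_cost = evaluate(toy_router, data)
    print(f"accuracy={acc:.2f}, cost=${total_cost:.4f}")
```

A real harness like RouterArena would additionally standardize the datasets, metrics, and model-cost tables so that different routing policies are comparable on the same leaderboard.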
Alternatives and similar repositories for RouterArena
Users interested in RouterArena are comparing it to the libraries listed below.
- Dynamic Context Selection for Efficient Long-Context LLMs ☆54 · Updated 8 months ago
- [NeurIPS 2025] A simple extension on vLLM to help you speed up reasoning models without training. ☆218 · Updated 8 months ago
- dInfer: An Efficient Inference Framework for Diffusion Language Models ☆410 · Updated last month
- [NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3) ☆196 · Updated 2 weeks ago
- Easy, Fast, and Scalable Multimodal AI ☆106 · Updated last week
- [CoLM'25] The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression" ☆155 · Updated 3 weeks ago
- [NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning (the base speculative-decoding loop is sketched after this list) ☆63 · Updated 3 months ago
- AI-Driven Research Systems (ADRS) ☆117 · Updated last month
- [ICLR 2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation ☆248 · Updated last year
- QeRL enables RL for 32B LLMs on a single H100 GPU. ☆481 · Updated 2 months ago
- 🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation… ☆116 · Updated 2 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆181 · Updated 2 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆104 · Updated 4 months ago
- MrlX: A Multi-Agent Reinforcement Learning Framework ☆189 · Updated 2 weeks ago
- Data Synthesis for Deep Research Based on Semi-Structured Data ☆197 · Updated last month
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆227 · Updated 3 months ago
- Kinetics: Rethinking Test-Time Scaling Laws ☆85 · Updated 6 months ago
- Easy and Efficient dLLM Fine-Tuning ☆208 · Updated 2 weeks ago
- [ACL 2025 Oral] SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 · Updated 8 months ago
- Block Diffusion for Ultra-Fast Speculative Decoding ☆459 · Updated this week
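Several entries above (lookahead reasoning, block-diffusion drafting, training-free vLLM speedups) extend speculative decoding. For orientation, here is a minimal, hypothetical sketch of the base greedy loop they all accelerate: a cheap draft model proposes k tokens, the target model verifies them, and generation falls back to the target's token at the first mismatch. `draft_next` and `target_next` are toy stand-ins, not any listed project's API; real systems verify with one batched target forward pass and, under sampling, use probabilistic acceptance instead of exact match.

```python
# Minimal sketch of greedy speculative decoding with toy integer "models".
# With greedy decoding, verification reduces to exact token match.
def draft_next(prefix):
    """Cheap draft model: guesses the next token (toy rule)."""
    return prefix[-1] + 1 if prefix else 0

def target_next(prefix):
    """Expensive target model: the ground-truth next token (toy rule)."""
    return (prefix[-1] + 1) if prefix and prefix[-1] % 5 != 4 else 0

def speculative_decode(prompt, k=4, max_len=12):
    tokens = list(prompt)
    while len(tokens) < max_len:
        # 1) Draft k tokens autoregressively with the cheap model.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2) Verify: accept the longest prefix the target model agrees with.
        accepted = 0
        for i in range(k):
            if target_next(tokens + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        tokens += draft[:accepted]
        if accepted < k:
            # 3) On the first mismatch, emit the target model's token instead.
            tokens.append(target_next(tokens))
    return tokens[:max_len]

print(speculative_decode([0]))
```

The speedup comes from step 2: when the draft model is usually right, one expensive verification pass yields several tokens instead of one, which is exactly the lever the lookahead-reasoning and block-diffusion projects above try to widen.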