Efficient LLM query routing via multi-sampling. BEST-Route selects both model and number of responses based on query difficulty, cutting costs by up to 60% with <1% performance drop. From the paper: https://arxiv.org/abs/2506.22716
☆58Apr 8, 2026Updated 2 months ago
Alternatives and similar repositories for best-route-llm
Users that are interested in best-route-llm are comparing it to the libraries listed below.
- ☆22May 23, 2025Updated last year
- ☆35Feb 17, 2026Updated 3 months ago
- [ACL 2025 Main] (🏆 Outstanding Paper Award) Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Proba…☆18Aug 15, 2025Updated 10 months ago
- The wafer-native AI accelerator simulation platform and inference engine.☆55Jan 1, 2026Updated 5 months ago
- The code of RouterDC☆75Apr 14, 2025Updated last year
- FrugalGPT: better quality and lower cost for LLM applications☆262Feb 10, 2025Updated last year
- ☆15Dec 22, 2022Updated 3 years ago
- A class for synchronizing sensor readings to the system clock☆11Oct 25, 2018Updated 7 years ago
- Code for SIGKDD2025 paper: An Efficient Diffusion-based Non-Autoregressive Solver for Traveling Salesman Problem☆14Jan 28, 2025Updated last year
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"☆11Oct 27, 2025Updated 7 months ago
- ☆12Apr 23, 2026Updated last month
- ☆16Apr 13, 2024Updated 2 years ago
- A platform that provides users with easy access to AI services developed by Montimage and usage of explainable AI techniques (e.g., LIME,…☆10Feb 17, 2026Updated 3 months ago
- An evaluation framework for data center traffic engineering.☆14Jul 28, 2024Updated last year
- Code for the paper "Age of Information Analysis in Edge Computing Servers"☆22Feb 12, 2024Updated 2 years ago
- [ICML 2024] Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models☆24Sep 12, 2024Updated last year
- SplitBud is a Split Learning framework built upon Flower☆14Mar 22, 2025Updated last year
- ☆11Jun 24, 2021Updated 4 years ago
- ☆12Mar 13, 2023Updated 3 years ago
- Survey on LLM Inference via Search (TMLR 2025)☆15May 6, 2025Updated last year
- LGDCloudSim is a resource management simulation system for large-scale geographically distributed cloud data center scenarios.☆16Mar 6, 2026Updated 3 months ago
- Code for the paper 'Energy Efficiency in Reinforcement Learning for Wireless Sensor Networks'☆25Sep 26, 2018Updated 7 years ago
- Implementation of self-certainty as an extension of ZeroEval Project☆36May 31, 2025Updated last year
- Introducing: Large Scale Capacity Consensus!☆14Nov 1, 2021Updated 4 years ago
- ☆16Apr 30, 2026Updated last month
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines☆19Dec 8, 2023Updated 2 years ago
- ☆16Feb 10, 2023Updated 3 years ago
- A Better Way to Attend: Attention with Trees for Video Question Answering☆25Mar 25, 2019Updated 7 years ago
- Official repository for "IPDPS' 24 QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices".☆20Feb 23, 2024Updated 2 years ago
- The official repository of ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning".☆20Nov 26, 2025Updated 6 months ago
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- Burstable Cloud Scheduler☆17Jun 6, 2024Updated 2 years ago
- [ICML 2025] X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP☆46Feb 3, 2026Updated 4 months ago
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆31Jun 1, 2024Updated 2 years ago
- A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.☆12Oct 12, 2018Updated 7 years ago
- LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure☆304Updated this week
- [DAI 2025] Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing☆211Dec 11, 2025Updated 6 months ago
- Mitigating Routing Update Overhead for Traffic Engineering by Combining Destination-based Routing with Reinforcement Learning☆15Oct 16, 2022Updated 3 years ago
- A comprehensive overview of Data Distillation and Condensation (DDC). DDC is a data-centric task where a representative (i.e., small but …☆13Dec 1, 2022Updated 3 years ago