withmartian/routerbench

The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System

☆172

Alternatives and similar repositories for routerbench

Users that are interested in routerbench are comparing it to the libraries listed below.

withmartian / leaderboard-backend
Open sourced backend for Martian's LLM Inference Provider Leaderboard
☆21 · Aug 13, 2024 · Updated last year
anyscale / llm-router
Tutorial for building LLM router
☆255 · Jul 19, 2024 · Updated 2 years ago
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
☆13 · Feb 11, 2026 · Updated 5 months ago
lamini-ai / llm-routing-agent
Agent that routes to different tools - LLM classifier SDK
☆46 · Jun 25, 2024 · Updated 2 years ago
epfl-dlab / forc
Framework for Cost-Effective Language Model Choice
☆16 · Dec 12, 2023 · Updated 2 years ago
Not-Diamond / awesome-ai-model-routing
A curated list of awesome approaches to AI model routing
☆233 · Mar 24, 2025 · Updated last year
lm-sys / RouteLLM
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
☆5,270 · Aug 10, 2024 · Updated last year
ulab-uiuc / Router-R1
[NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
☆148 · Dec 30, 2025 · Updated 6 months ago
stanford-futuredata / FrugalGPT
FrugalGPT: better quality and lower cost for LLM applications
☆275 · Feb 10, 2025 · Updated last year
aigeek0x0 / radiantloom-email-assist-7b
Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…
☆14 · Jan 19, 2024 · Updated 2 years ago
richardzhuang0412 / EmbedLLM
Repo for EmbedLLM: Learning Compact Representations of Large Language Models
☆32 · Sep 25, 2025 · Updated 10 months ago
shoaibahmed / metadata_archaeology
Official code for the paper: "Metadata Archaeology"
☆19 · May 10, 2023 · Updated 3 years ago
Not-Diamond / RoRF
Routing on Random Forest (RoRF)
☆245 · Sep 24, 2024 · Updated last year
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
pixeli99 / MixLN
[ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…
☆30 · Jul 24, 2025 · Updated last year
padas-lab-de / ir-rag-sigir24-persona-rag
☆55 · Jun 23, 2026 · Updated last month
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆32 · Sep 19, 2025 · Updated 10 months ago
awslabs / extending-the-context-length-of-open-source-llms
☆56 · Jun 26, 2025 · Updated last year
ScalingIntelligence / Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
☆207 · Mar 7, 2025 · Updated last year
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆105 · Dec 6, 2024 · Updated last year
socialfoundations / benchbench
BenchBench is a Python package to evaluate multi-task benchmarks.
☆23 · Oct 12, 2025 · Updated 9 months ago
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆45 · Oct 1, 2025 · Updated 9 months ago
v-prgmr / mergekit
Tools for merging pretrained large language models.
☆19 · Jun 12, 2024 · Updated 2 years ago
doubleshow / superlinked
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal…
☆12 · Sep 16, 2024 · Updated last year
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆735 · Apr 10, 2024 · Updated 2 years ago
S-LoRA / S-LoRA
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
☆1,920 · Jan 21, 2024 · Updated 2 years ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆93 · May 23, 2023 · Updated 3 years ago
riiid / PPAP
Official pytorch implementation of "Towards Practical Plug-and-Play Diffusion Models" in CVPR2023
☆22 · Jul 22, 2023 · Updated 3 years ago
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆158 · Nov 20, 2023 · Updated 2 years ago
pacman100 / peft-codegen-25
☆23 · Jul 10, 2023 · Updated 3 years ago
BatsResearch / bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
☆831 · Jul 15, 2025 · Updated last year
MrBananaHuman / open-korean-instructions
A collection of publicly available Korean instruction datasets for training language models.
☆19 · Jul 16, 2023 · Updated 3 years ago
SLAB-NLP / Multi-Prompt-LLM-Evaluation
State of What Art? A Call for Multi-Prompt LLM Evaluation
☆16 · Apr 10, 2026 · Updated 3 months ago
ZIZUN / RADCoT
Code for "RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion", LREC-CO…
☆11 · May 25, 2024 · Updated 2 years ago
LLM360 / TxT360
☆25 · Dec 18, 2024 · Updated last year
gigagenie / ginside-sdk
GiGA Genie INSIDE (G-INSIDE) SDK
☆11 · Jul 31, 2024 · Updated last year
DiT-Serving / TetriServe
[ASPLOS' 26] TetriServe: Efficiently Serving Mixed DiT Workloads
☆17 · Mar 12, 2026 · Updated 4 months ago
QuixiAI / VibeLogger
☆19 · Dec 31, 2025 · Updated 6 months ago
flexflow / flexflow-train
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
☆1,898 · Updated this week