The code for the paper ROUTERBENCH: A Benchmark for Multi-LLM Routing System
☆164Jun 13, 2024Updated last year
Alternatives and similar repositories for routerbench
Users that are interested in routerbench are comparing it to the libraries listed below.
- Open sourced backend for Martian's LLM Inference Provider Leaderboard☆21Aug 13, 2024Updated last year
- A curated list of awesome approaches to AI model routing☆203Mar 24, 2025Updated last year
- Tutorial for building LLM router☆252Jul 19, 2024Updated last year
- ☆12Feb 11, 2026Updated 3 months ago
- Framework for Cost-Effective Language Model Choice☆16Dec 12, 2023Updated 2 years ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality☆4,949Aug 10, 2024Updated last year
- FrugalGPT: better quality and lower cost for LLM applications☆259Feb 10, 2025Updated last year
- [ICLR 2025] Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆30Jul 24, 2025Updated 10 months ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- Official code for the paper: "Metadata Archaeology"☆19May 10, 2023Updated 3 years ago
- ☆56Jun 26, 2025Updated 11 months ago
- ☆19Dec 31, 2025Updated 4 months ago
- ☆106Dec 6, 2024Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆55Apr 18, 2026Updated last month
- ☆59Aug 19, 2025Updated 9 months ago
- ☆22Dec 18, 2024Updated last year
- Distributed multi-agent framework for event-driven, graph-based computation. Elixir/Python, NATS event streaming, modular operator/XCS ar…☆14Mar 25, 2026Updated 2 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 8 months ago
- [ACL'25] Code for ACL'25 paper "IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory"☆32Feb 19, 2025Updated last year
- Accurate, large-scale, and extensible simulator for LLM inference Systems☆608Jul 25, 2025Updated 10 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆202Mar 7, 2025Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆45Oct 1, 2025Updated 7 months ago
- Superfast AI decision making and intelligent processing of multi-modal data.☆3,561May 23, 2026Updated last week
- Code for Fooling Contrastive Language-Image Pre-trained Models with CLIPMasterPrints☆15Jan 25, 2026Updated 4 months ago
- Code for the paper "Learning Step-Size Adaptation in CMA-ES"☆12Mar 24, 2023Updated 3 years ago
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal…☆12Sep 16, 2024Updated last year
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining☆736Apr 10, 2024Updated 2 years ago
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters☆1,913Jan 21, 2024Updated 2 years ago
- Code repo for efficient quantized MoE inference with mixture of low-rank compensators☆36Apr 14, 2025Updated last year
- PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".☆92May 23, 2023Updated 3 years ago
- BenchBench is a Python package to evaluate multi-task benchmarks.☆20Oct 12, 2025Updated 7 months ago
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.☆826Jul 15, 2025Updated 10 months ago
- A collection of public Korean instruction datasets for training language models.☆19Jul 16, 2023Updated 2 years ago
- Tools for merging pretrained large language models.☆19Jun 12, 2024Updated last year
- ☆23Jul 10, 2023Updated 2 years ago
- State of What Art? A Call for Multi-Prompt LLM Evaluation☆16Apr 10, 2026Updated last month
- Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI…☆94Apr 29, 2024Updated 2 years ago
- Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training☆1,881May 21, 2026Updated last week