ulab-uiuc / Router-R1
Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
☆28 Updated 3 weeks ago
Alternatives and similar repositories for Router-R1
Users who are interested in Router-R1 are comparing it to the repositories listed below
- ☆66 Updated 3 months ago
- ☆47 Updated this week
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆24 Updated 3 weeks ago
- Resa: Transparent Reasoning Models via SAEs ☆39 Updated last month
- ☆19 Updated 4 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆19 Updated 3 weeks ago
- ☆23 Updated 3 weeks ago
- ☆56 Updated 7 months ago
- Lottery Ticket Adaptation ☆39 Updated 7 months ago
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” ☆18 Updated last month
- ☆24 Updated 9 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆20 Updated 2 months ago
- ☆36 Updated last month
- ☆47 Updated 9 months ago
- Official code repository for the paper "ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind" ☆12 Updated last month
- XmodelLM ☆39 Updated 7 months ago
- ☆33 Updated this week
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search ☆45 Updated last week
- ☆45 Updated last month
- Official Repository for Task-Circuit Quantization ☆20 Updated last month
- Official Repo for RuleReasoner. ☆24 Updated last month
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆95 Updated last month
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆51 Updated 7 months ago
- ☆24 Updated 3 weeks ago
- ☆52 Updated last week
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- Official implementation of ECCV24 paper: POA ☆24 Updated 11 months ago
- ☆13 Updated 7 months ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆73 Updated 3 weeks ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆63 Updated last month