Agent-RRM: Exploring Reasoning Reward Model for Agents
☆55 Mar 17, 2026 Updated this week
Alternatives and similar repositories for Reagent
Users that are interested in Reagent are comparing it to the libraries listed below.
- Sotopia-RL: Reward Design for Social Intelligence ☆47 Jan 29, 2026 Updated last month
- ☆34 Apr 1, 2025 Updated 11 months ago
- A python script for downloading huggingface datasets and models. ☆20 Apr 10, 2025 Updated 11 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆28 Aug 9, 2025 Updated 7 months ago
- [EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval" ☆22 Mar 11, 2026 Updated last week
- Official implementation of paper "Process Reward Model with Q-value Rankings" ☆66 Feb 5, 2025 Updated last year
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆40 Aug 7, 2025 Updated 7 months ago
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning" ☆30 Jan 12, 2026 Updated 2 months ago
- Official implementation of "PyVision-RL: Forging Open Agentic Vision Models via RL." ☆83 Feb 25, 2026 Updated 3 weeks ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 May 16, 2024 Updated last year
- The code of MGCC: Text-based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning ☆20 Feb 26, 2025 Updated last year
- ☆15 Jan 24, 2025 Updated last year
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 Dec 14, 2025 Updated 3 months ago
- VideoGPA is a self-supervised framework that enhances 3D consistency in Video Diffusion Models. ☆42 Mar 16, 2026 Updated last week
- The raw UserRL repo under construction ☆97 Sep 25, 2025 Updated 5 months ago
- [CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection ☆25 Feb 10, 2026 Updated last month
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge ☆113 Jan 30, 2026 Updated last month
- ☆11 Sep 19, 2025 Updated 6 months ago
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality ☆20 Oct 22, 2025 Updated 5 months ago
- Official Repository of paper: "MotionEdit: Benchmarking and Learning Motion-Centric Image Editing" ☆62 Feb 28, 2026 Updated 3 weeks ago
- Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments. ☆26 Feb 18, 2026 Updated last month
- Mixture of Lora Experts ☆10 Apr 7, 2024 Updated last year
- ☆14 Jul 17, 2025 Updated 8 months ago
- ☆19 Mar 10, 2025 Updated last year
- Unlocking Iterative Reasoning for Any Image Editor ☆99 Jan 18, 2026 Updated 2 months ago
- Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations ☆22 Dec 24, 2025 Updated 2 months ago
- CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL 2024 Findings) ☆16 Apr 26, 2024 Updated last year
- A personal collection of multimodal dialogue system papers I have read (with some notes) ☆22 Jan 5, 2023 Updated 3 years ago
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆20 Jul 19, 2024 Updated last year
- [CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice ☆74 Feb 27, 2026 Updated 3 weeks ago
- (ACL 2025) 🔥🔥🔥 Code for "Empowering Multimodal Large Language Models with Evol-Instruct" ☆20 May 15, 2025 Updated 10 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆41 Sep 30, 2024 Updated last year
- Gated Pretrained Transformer model for robust denoised sequence-to-sequence modelling ☆10 May 29, 2021 Updated 4 years ago
- The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings) ☆14 Aug 12, 2024 Updated last year
- Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures ☆31 Jan 29, 2026 Updated last month
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking ☆38 Jan 20, 2026 Updated 2 months ago
- ☆23 Sep 19, 2024 Updated last year
- [NeurIPS 2024] "Collaboration! Towards Robust Neural Methods for Routing Problems" ☆21 Nov 16, 2024 Updated last year
- ☆15 Apr 26, 2025 Updated 10 months ago