Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆26 · Aug 9, 2025 · Updated 6 months ago
Alternatives and similar repositories for ARIA
Users who are interested in ARIA are comparing it to the libraries listed below
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- [ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents ☆48 · Feb 2, 2026 · Updated last month
- ☆72 · Jun 10, 2025 · Updated 8 months ago
- ☆25 · Aug 19, 2025 · Updated 6 months ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 8 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆14 · Jun 28, 2025 · Updated 8 months ago
- The code for "T-GRAG: A Dynamic GraphRAG Framework for Resolving Temporal Conflicts and Redundancy in Knowledge Retrieval" ☆20 · Jul 30, 2025 · Updated 7 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 · Feb 9, 2026 · Updated 3 weeks ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 · Sep 22, 2025 · Updated 5 months ago
- ☆26 · Jan 4, 2026 · Updated last month
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆31 · Jan 25, 2026 · Updated last month
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆64 · Jun 13, 2025 · Updated 8 months ago
- Source code for our paper: "LoGU: Long-form Generation with Uncertainty Expressions". ☆16 · May 27, 2025 · Updated 9 months ago
- Bayes-Adaptive RL for LLM Reasoning ☆45 · May 28, 2025 · Updated 9 months ago
- Code for paper Empowering Large Language Model Agents through Action Learning ☆33 · Aug 8, 2024 · Updated last year
- [NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok… ☆78 · Feb 10, 2026 · Updated 3 weeks ago
- ☆15 · Sep 22, 2024 · Updated last year
- Agent-RRM: Exploring Reasoning Reward Model for Agents ☆49 · Updated this week
- From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆48 · Dec 25, 2025 · Updated 2 months ago
- [ICLR 2026] Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents ☆34 · Feb 1, 2026 · Updated last month
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆65 · Oct 26, 2025 · Updated 4 months ago
- Control LLM ☆22 · Apr 6, 2025 · Updated 10 months ago
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning" ☆29 · Jun 3, 2025 · Updated 9 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆52 · Jul 15, 2025 · Updated 7 months ago
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution ☆25 · Nov 11, 2025 · Updated 3 months ago
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference ☆20 · Jan 24, 2025 · Updated last year
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators ☆47 · Dec 23, 2025 · Updated 2 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆35 · Aug 28, 2025 · Updated 6 months ago
- a survey on deep research ☆47 · Sep 9, 2025 · Updated 5 months ago
- [AAAI 2026] Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework ☆45 · Jan 25, 2026 · Updated last month
- ☆19 · Mar 10, 2025 · Updated 11 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 · Feb 4, 2026 · Updated 3 weeks ago
- Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles" ☆175 · Dec 25, 2025 · Updated 2 months ago
- ☆21 · Aug 30, 2025 · Updated 6 months ago
- ☆31 · Sep 12, 2025 · Updated 5 months ago
- MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning ☆42 · Sep 3, 2025 · Updated 6 months ago
- The official implementation of Cross-Task Experience Sharing (COPS) ☆29 · Oct 23, 2024 · Updated last year
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling' ☆27 · May 16, 2025 · Updated 9 months ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆21 · Apr 2, 2024 · Updated last year