tongjingqi / Awesome-Agent-Reward
A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research and practical guides on defining and collecting rewards to build more intelligent and aligned AI agents.
☆34Updated this week
Alternatives and similar repositories for Awesome-Agent-Reward
Users that are interested in Awesome-Agent-Reward are comparing it to the libraries listed below
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Updated last year
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?☆34Updated 3 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows"☆105Updated last week
- [COLM'24] Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration☆29Updated 10 months ago
- Generative AI Act II: Test Time Scaling Drives Cognition Engineering☆204Updated 4 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆87Updated 6 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆315Updated last month
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆224Updated 2 weeks ago
- A curated list of papers on LLMs and agents for scientific research and development☆71Updated 8 months ago
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"☆31Updated last year
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆95Updated 8 months ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions.☆46Updated 3 weeks ago
- [ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.☆81Updated 6 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 9 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆289Updated 3 weeks ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆244Updated 3 weeks ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process!☆69Updated 5 months ago
- ☆56Updated 10 months ago
- ☆262Updated last month
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆134Updated last month
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆56Updated 9 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆154Updated this week
- Test-time preference optimization (ICML 2025).☆162Updated 3 months ago
- AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)☆314Updated last month
- ☆147Updated 3 months ago
- A method of ensemble learning for heterogeneous large language models.☆60Updated last year
- Latest Advances on Long Chain-of-Thought Reasoning☆492Updated last month
- LLM for Scientific Research Survey☆98Updated 7 months ago
- ☆14Updated last year
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆288Updated last month