dongguanting / ARPO

The official code for “Agentic Reinforced Policy Optimization”, an agentic RL algorithm.

☆482

Alternatives and similar repositories for ARPO

Users who are interested in ARPO are comparing it to the repositories listed below.

0russwest0 / Awesome-Agent-RL
☆349 Updated last week
qiancheng0 / ToolRL
☆317 Updated 2 months ago
dongguanting / Tool-Star
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆236 Updated last week
GAIR-NLP / ToRL
☆271 Updated 2 months ago
0russwest0 / Agent-R1
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
☆757 Updated last month
ReTool-RL / ReTool
☆187 Updated last week
bruno686 / Awesome-Agent-Training
Awesome Agent Training
☆213 Updated 2 weeks ago
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆248 Updated 3 months ago
RUCAIBox / R1-Searcher
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
☆623 Updated 2 weeks ago
TsinghuaC3I / MARTI
A Framework for LLM-based Multi-Agent Reinforced Training and Inference
☆208 Updated this week
ElliottYan / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆282 Updated last month
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆722 Updated last week
GAIR-NLP / DeepResearcher
Scaling Deep Research via Reinforcement Learning in Real-world Environments.
☆558 Updated 4 months ago
XiaoYee / Awesome_Efficient_LRM_Reasoning
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
☆286 Updated last week
LightChen233 / Awesome-Long-Chain-of-Thought-Reasoning
Latest Advances on Long Chain-of-Thought Reasoning
☆481 Updated last month
TIGER-AI-Lab / verl-tool
A version of verl to support tool use
☆333 Updated this week
Gen-Verse / ReasonFlux
ReasonFlux Series - A family of LLM post-training algorithms focusing on data selection, reinforcement learning, and inference scaling
☆481 Updated 2 weeks ago
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆140 Updated last week
BytedTsinghua-SIA / MemAgent
A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
☆605 Updated 3 weeks ago
Eclipsess / Awesome-Efficient-Reasoning-LLMs
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
☆574 Updated this week
langfengQ / verl-agent
verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in…
☆765 Updated this week
eddycmu / demystify-long-cot
☆312 Updated 2 months ago
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆148 Updated 7 months ago
thinkwee / AgentsMeetRL
An Awesome List of Reinforcement Learning-based Large Language Agent Works. Collected directly from official codebases.
☆279 Updated last week
InternLM / POLAR
Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.
☆147 Updated last month
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆133 Updated last month
ruixin31 / Spurious_Rewards
☆325 Updated 3 weeks ago
RAGEN-AI / VAGEN
☆204 Updated last week
ADaM-BJTU / AutoCoA
AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…
☆124 Updated 5 months ago
CharlesQ9 / Self-Evolving-Agents
☆343 Updated 3 weeks ago