[ICLR 2026] Agentic Reinforced Policy Optimization (ARPO)
★916 · Jan 28, 2026 · Updated last month
Alternatives and similar repositories for ARPO
Users that are interested in ARPO are comparing it to the libraries listed below.
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning · ★328 · Jan 3, 2026 · Updated 2 months ago
- RAG methods, benchmarks, and toolkits · ★19 · Nov 28, 2024 · Updated last year
- [ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning · ★362 · Jan 12, 2026 · Updated 2 months ago
- ★175 · Feb 24, 2026 · Updated last month
- ★41 · Updated this week
- ★67 · Aug 14, 2025 · Updated 7 months ago
- verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in… · ★1,697 · Feb 27, 2026 · Updated 3 weeks ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay · ★153 · May 29, 2025 · Updated 9 months ago
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei… · ★1,349 · May 16, 2025 · Updated 10 months ago
- verl: Volcano Engine Reinforcement Learning for LLMs · ★20,097 · Updated this week
- Some example codes for drawing figures in research paper · ★35 · Mar 3, 2022 · Updated 4 years ago
- Search-o1: Agentic Search-Enhanced Large Reasoning Models [EMNLP 2025] · ★1,183 · Nov 17, 2025 · Updated 4 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… · ★411 · Aug 26, 2025 · Updated 6 months ago
- A version of verl to support diverse tool use · ★923 · Mar 2, 2026 · Updated 3 weeks ago
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL · ★4,261 · Nov 13, 2025 · Updated 4 months ago
- ★39 · Nov 13, 2025 · Updated 4 months ago
- The demo, code and data of FollowRAG · ★76 · Jun 30, 2025 · Updated 8 months ago
- OmniGAIA: Towards Native Omni-Modal AI Agents · ★82 · Mar 16, 2026 · Updated last week
- Scaling Deep Research via Reinforcement Learning in Real-world Environments. · ★713 · Oct 15, 2025 · Updated 5 months ago
- ★182 · Dec 5, 2025 · Updated 3 months ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR · ★1,762 · May 11, 2025 · Updated 10 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. · ★552 · Sep 8, 2025 · Updated 6 months ago
- ★218 · Feb 20, 2025 · Updated last year
- ★1,638 · Jan 20, 2026 · Updated 2 months ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use · ★29 · Nov 4, 2025 · Updated 4 months ago
- ★27 · Jul 18, 2025 · Updated 8 months ago
- Democratizing Reinforcement Learning for LLMs · ★5,259 · Updated this week
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL · ★4,748 · Mar 10, 2026 · Updated 2 weeks ago
- HierSearch: A Hierarchical Enterprise Deep Search Framework Integrating Local and Web Searches · ★37 · Oct 9, 2025 · Updated 5 months ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. · ★2,553 · Mar 15, 2026 · Updated last week
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL) · ★9,231 · Updated this week
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" · ★82 · Dec 20, 2024 · Updated last year
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models · ★2,989 · Updated this week
- Official Repo for Open-Reasoner-Zero · ★2,086 · Jun 2, 2025 · Updated 9 months ago
- A series of technical report on Slow Thinking with LLM · ★763 · Aug 13, 2025 · Updated 7 months ago
- ★139 · Nov 17, 2025 · Updated 4 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling · ★186 · Jul 23, 2025 · Updated 8 months ago
- RL with Experience Replay · ★55 · Jul 27, 2025 · Updated 7 months ago
- SSRL: Self-Search Reinforcement Learning · ★207 · Aug 20, 2025 · Updated 7 months ago