dvlab-research / ARPO

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

☆140

Alternatives and similar repositories for ARPO

Users who are interested in ARPO are comparing it to the repositories listed below.

ltzheng / SimpleTIR
End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆345 Updated 3 months ago
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆172 Updated 2 months ago
mll-lab-nu / VAGEN
Training VLM agents with multi-turn reinforcement learning
☆358 Updated 3 weeks ago
MIT-MI / MEM1
☆212 Updated 2 months ago
ritzz-ai / GUI-R1
Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
☆208 Updated 7 months ago
open-compass / GTA
[NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents
☆131 Updated 9 months ago
ruixin31 / Spurious_Rewards
☆345 Updated 5 months ago
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆135 Updated last month
cmu-l3 / l1
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
☆257 Updated 7 months ago
RM-R1-UIUC / RM-R1
RM-R1: Unleashing the Reasoning Potential of Reward Models
☆156 Updated 6 months ago
RUC-NLPIR / Tool-Star
🔧 Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆298 Updated 2 months ago
hkust-nlp / GUIMid
☆21 Updated 7 months ago
RUCAIBox / R1-Searcher-plus
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
☆67 Updated 7 months ago
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆197 Updated 3 months ago
facebookresearch / sweet_rl
Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks
☆254 Updated 7 months ago
THUDM / VisualAgentBench
Towards Large Multimodal Models as Visual Foundation Agents
☆248 Updated 8 months ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆372 Updated 4 months ago
PRIME-RL / ImplicitPRM
Repo of paper "Free Process Rewards without Process Labels"
☆168 Updated 9 months ago
GAIR-NLP / ToRL
☆323 Updated 7 months ago
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆91 Updated last month
THU-KEG / AdaptThink
☆175 Updated 3 weeks ago
NVlabs / Tool-N1
☆213 Updated 6 months ago
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆143 Updated last month
Yan98 / GTA1
☆119 Updated 2 months ago
InternLM / OREAL
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆190 Updated 9 months ago
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆184 Updated 4 months ago
OSU-NLP-Group / WebDreamer
[TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"
☆92 Updated 2 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆170 Updated 6 months ago
multimodal-art-projection / REER_DeepWriter
REverse-Engineered Reasoning for Open-Ended Generation
☆84 Updated 3 months ago
KANABOON1 / MemGen
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
☆256 Updated last month