dvlab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆99 Updated 2 months ago
Alternatives and similar repositories for ARPO
Users who are interested in ARPO are comparing it to the libraries listed below
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆139 Updated 8 months ago
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆91 Updated last month
- ☆52 Updated last month
- ☆195 Updated this week
- Efficient Agent Training for Computer Use ☆120 Updated last month
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆158 Updated this week
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆156 Updated last month
- Natural Language Reinforcement Learning ☆92 Updated this week
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆59 Updated 6 months ago
- ☆50 Updated last month
- ☆75 Updated 2 weeks ago
- ☆21 Updated 3 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆118 Updated last month
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆58 Updated 9 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆87 Updated 3 months ago
- ☆75 Updated last week
- Resources for the Enigmata Project. ☆58 Updated last month
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆82 Updated 2 months ago
- ☆322 Updated this week
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆105 Updated 2 months ago
- A repo for open research on building large reasoning models ☆84 Updated this week
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆49 Updated 2 months ago
- ☆107 Updated 3 months ago
- ☆84 Updated last week
- RL Scaling and Test-Time Scaling (ICML'25) ☆109 Updated 6 months ago
- The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆160 Updated 3 weeks ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆146 Updated 9 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆114 Updated 4 months ago
- An Illusion of Progress? Assessing the Current State of Web Agents ☆74 Updated last week
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆159 Updated last week