ritzz-ai / GUI-R1
Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents
☆200 Updated 6 months ago
Alternatives and similar repositories for GUI-R1
Users who are interested in GUI-R1 are comparing it to the repositories listed below.
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… ☆84 Updated 2 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆204 Updated last month
- [AAAI-2026] Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning" ☆138 Updated last week
- A Self-Training Framework for Vision-Language Reasoning ☆86 Updated 9 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆137 Updated 5 months ago
- MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight) ☆69 Updated 5 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆95 Updated last month
- Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners" ☆60 Updated 5 months ago
- ☆109 Updated 2 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆164 Updated 5 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆167 Updated last month
- Paper collections of multi-modal LLM for Math/STEM/Code. ☆129 Updated 3 weeks ago
- A RLHF Infrastructure for Vision-Language Models ☆186 Updated last year
- Official Repository of "Learning what reinforcement learning can't" ☆69 Updated 2 months ago
- ☆84 Updated last year
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models. ☆146 Updated last month
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆347 Updated 2 months ago
- Extrapolating RLVR to General Domains without Verifiers ☆178 Updated 3 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆242 Updated 6 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆318 Updated last month
- Training VLM agents with multi-turn reinforcement learning ☆304 Updated last week
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆381 Updated 4 months ago
- [ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 e… ☆131 Updated 3 months ago
- [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding ☆91 Updated 7 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond ☆313 Updated 3 weeks ago
- GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning ☆167 Updated last month
- Paper List of Inference/Test Time Scaling/Computing ☆320 Updated 2 months ago
- ☆282 Updated 4 months ago
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆208 Updated last month
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models. ☆69 Updated 8 months ago