ZJU-REAL / GUI-RCPO
[AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615
☆58 Updated 3 months ago
Alternatives and similar repositories for GUI-RCPO
Users who are interested in GUI-RCPO are comparing it to the libraries listed below.
- ☆271 Updated last week
- [NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆376 Updated 3 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆55 Updated 3 months ago
- ☆148 Updated 6 months ago
- Collection of model-centric MCP servers ☆25 Updated 8 months ago
- ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization ☆95 Updated 8 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆178 Updated 4 months ago
- ☆36 Updated 4 months ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆136 Updated 5 months ago
- The code and data of We-Math 2.0. ☆164 Updated 5 months ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆94 Updated 7 months ago
- ☆33 Updated 6 months ago
- Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning ☆65 Updated last month
- [TMLR 2025] Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, a… ☆57 Updated 3 weeks ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆392 Updated 5 months ago
- Official Repository for PosterGen ☆211 Updated this week
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant ☆44 Updated last year
- ☆32 Updated 5 months ago
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce… ☆441 Updated 3 weeks ago
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts". ☆43 Updated 4 months ago
- [ICLR 2026] Efficient Agent Training for Computer Use ☆135 Updated 5 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆306 Updated 3 months ago
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆147 Updated 6 months ago
- ✨✨Latest Papers and Datasets on Mobile and PC GUI Agent ☆149 Updated last year
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆107 Updated 6 months ago
- ASTRA is an end-to-end system for synthesizing agentic trajectories and rule-verifiable environments for SFT and RL training, developed b… ☆109 Updated last week
- EAFT (Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting) official repo ☆82 Updated 3 weeks ago
- The code and data of We-Math, accepted by ACL 2025 main conference. ☆134 Updated last month
- Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework ☆43 Updated 2 weeks ago
- Repo for "MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability" ☆148 Updated 8 months ago