InfiXAI / InfiGUI-R1

Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners"

☆60

Alternatives and similar repositories for InfiGUI-R1

Users who are interested in InfiGUI-R1 are comparing it to the repositories listed below.

lll6gg / UI-R1
[AAAI-2026] Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"
☆138 Updated last week
ritzz-ai / GUI-R1
Official implementation of GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents
☆200 Updated 6 months ago
OpenGVLab / GUI-Odyssey
[ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 e…
☆131 Updated 3 months ago
IMNearth / CoAT
Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)
☆95 Updated last year
open-compass / MMBench-GUI
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w…
☆84 Updated 2 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆86 Updated 9 months ago
YuxiangChai / AMEX-codebase
☆32 Updated last year
RUCBM / GUICourse
GUICourse: From General Vision Language Models to Versatile GUI Agents
☆133 Updated last year
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆167 Updated last month
OpenGVLab / ZeroGUI
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
☆100 Updated 3 months ago
aialt / awesome-mobile-agents
✨✨Latest Papers and Datasets on Mobile and PC GUI Agent
☆140 Updated 11 months ago
THUDM / VisualAgentBench
Towards Large Multimodal Models as Visual Foundation Agents
☆242 Updated 6 months ago
dvlab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆137 Updated 5 months ago
RifleZhang / LLaVA-Reasoner-DPO
☆99 Updated 10 months ago
ai-agents-2030 / SPA-Bench
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation
☆49 Updated 4 months ago
mat-agent / MAT-Agent
MAT: Multi-modal Agent Tuning 🔥 ICLR 2025 (Spotlight)
☆69 Updated 4 months ago
Open-Reasoner-Zero / Open-Vision-Reasoner
[NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reason…
☆144 Updated 2 months ago
AMAP-ML / GPG
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
☆167 Updated last month
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆109 Updated 5 months ago
UITron-hub / UItron
☆62 Updated 2 months ago
TongUI-agent / TongUI-agent
Release of code, datasets and model for our work TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
☆56 Updated 3 weeks ago
HJYao00 / Awesome-Agentic-MLLMs
Agentic MLLMs
☆77 Updated 3 weeks ago
EvolvingLMMs-Lab / multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…
☆347 Updated 2 months ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆97 Updated last year
TideDra / VL-RLHF
An RLHF Infrastructure for Vision-Language Models
☆186 Updated last year
NUS-TRAIL / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆95 Updated last month
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆117 Updated 11 months ago
chenllliang / G1
G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
☆88 Updated 5 months ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆185 Updated 6 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆164 Updated 5 months ago