open-compass/MMBench-GUI

Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android, and Web.

☆112

Alternatives and similar repositories for MMBench-GUI

Users who are interested in MMBench-GUI are comparing it to the repositories listed below.

OpenGVLab / ZeroGUI
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
☆119 · Jul 17, 2025 · Updated last year
VeriGUI-Team / VeriWeb
VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking
☆88 · Jan 21, 2026 · Updated 6 months ago
OpenGVLab / NaViL
☆94 · Oct 10, 2025 · Updated 9 months ago
hkust-nlp / GUIMid
☆22 · May 3, 2025 · Updated last year
Yan98 / GTA1
☆130 · Oct 3, 2025 · Updated 9 months ago
JIA-Lab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆162 · May 29, 2025 · Updated last year
tiangeluo / RegionFocus
A simple visual test-time scaling method for GUI agent grounding
☆26 · Dec 7, 2025 · Updated 7 months ago
OS-Copilot / OS-Symphony
[ACL 2026 Main] Official repository for paper: OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agents
☆47 · Apr 7, 2026 · Updated 3 months ago
OpenGVLab / SDLM
Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt…
☆98 · Dec 27, 2025 · Updated 6 months ago
WebChoreArena / WebChoreArena
COLM2026
☆36 · Jul 9, 2026 · Updated last week
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
Tongyi-MAI / MobileWorld
Benchmarking Autonomous Mobile Agents in Agent-User Interactive and MCP-Augmented Environments (ACL 2026)
☆240 · Jul 2, 2026 · Updated 2 weeks ago
OpenGVLab / Vlaser
Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning
☆49 · Mar 18, 2026 · Updated 4 months ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
penghao-wu / GUI_Reflection
☆34 · Sep 19, 2025 · Updated 10 months ago
open-compass / GenEditEvalKit
The first unified, efficient, and extensible evaluation toolkit for evaluating image generation and editing models across multiple benchm…
☆50 · Apr 12, 2026 · Updated 3 months ago
UITron-hub / UItron
☆67 · Sep 6, 2025 · Updated 10 months ago
OSU-NLP-Group / Explorer
[ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
☆29 · Feb 17, 2026 · Updated 5 months ago
xlang-ai / aguvis
[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
☆389 · Mar 7, 2025 · Updated last year
OpenGVLab / InternVL-U
InternVL-U is a 4B-parameter unified multimodal model (UMM) that brings multimodal understanding, reasoning, image generation, image edit…
☆291 · Mar 21, 2026 · Updated 3 months ago
uivision / UI-Vision
☆32 · Jul 3, 2025 · Updated last year
njucckevin / SeeClick
The model, data and code for the visual GUI Agent SeeClick
☆490 · Jul 13, 2025 · Updated last year
Euphoria16 / UI-Genie
[NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents
☆60 · Nov 27, 2025 · Updated 7 months ago
showlab / Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
☆1,197 · Aug 17, 2025 · Updated 11 months ago
showlab / macosworld
☆35 · Jan 28, 2026 · Updated 5 months ago
OSU-NLP-Group / GUI-Agents-Paper-List
Awesome GUI Agent Paper List
☆861 · Jun 28, 2026 · Updated 3 weeks ago
open-compass / TextEdit
We provide TextEdit, a high-quality, multi-scenario text editing benchmark for generation models.
☆20 · Mar 16, 2026 · Updated 4 months ago
MiroMindAI / MiroTrain
MiroTrain is an efficient and algorithm-first framework research agent.
☆142 · Aug 27, 2025 · Updated 10 months ago
YXB-NKU / SE-GUI
[NeurIPS 2025] "Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning"
☆107 · Oct 21, 2025 · Updated 9 months ago
THUDM / MobileRL
☆93 · Dec 23, 2025 · Updated 6 months ago
njucckevin / OpenMobile-Code
The model, data and code for OpenMobile
☆49 · Jul 9, 2026 · Updated last week
facebookresearch / Geo-metric
"Geo-metric: A Perceptual Dataset of Distortions on Faces" by Wolski et al., SIGGRAPH Asia 2022.
☆24 · Nov 9, 2022 · Updated 3 years ago
OS-Copilot / OS-Sentinel
[ACL 2026] Code, benchmark and environment for "OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic…
☆49 · Jul 5, 2026 · Updated 2 weeks ago
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆172 · Jun 18, 2026 · Updated last month
likaixin2000 / ScreenSpot-Pro-GUI-Grounding
GUI Grounding for Professional High-Resolution Computer Use
☆383 · Jun 17, 2026 · Updated last month
zhangmiaosen2000 / Phi-Ground
Home page for Microsoft Phi-Ground tech-report
☆22 · Sep 8, 2025 · Updated 10 months ago
xlang-ai / OSWorld-V2
OSWorld 2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
☆196 · Jul 9, 2026 · Updated last week
Computer-use-agents / dart-gui
DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation
☆94 · Feb 26, 2026 · Updated 4 months ago
OpenGVLab / V2PE
[ICCV2025] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding
☆60 · Apr 4, 2026 · Updated 3 months ago