Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android, and Web.
☆100Sep 8, 2025Updated 5 months ago
Alternatives and similar repositories for MMBench-GUI
Users who are interested in MMBench-GUI are comparing it to the repositories listed below
- ☆21May 3, 2025Updated 9 months ago
- A simple visual test-time scaling method for GUI agent grounding☆20Dec 7, 2025Updated 2 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models☆51Jun 12, 2025Updated 8 months ago
- VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking☆86Jan 21, 2026Updated last month
- ☆30Jul 3, 2025Updated 7 months ago
- [NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents☆53Nov 27, 2025Updated 3 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost☆110Jul 17, 2025Updated 7 months ago
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning☆14Jun 28, 2025Updated 8 months ago
- BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions [TMLR2025]☆29Jan 13, 2026Updated last month
- An environment for mobile agents to interact with a realistic Android device or Android emulator☆13Jul 19, 2024Updated last year
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 10 months ago
- Responsible Robotic Manipulation☆16Aug 31, 2025Updated 6 months ago
- Under construction☆13Jan 15, 2025Updated last year
- [AAAI 2026] ReCode: Reinforced Code Knowledge Editing for API Updates☆22Jul 1, 2025Updated 8 months ago
- Aligning Agentic World Models via Knowledgeable Experience Learning☆31Jan 25, 2026Updated last month
- [NeurIPS 2025]"Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning"☆96Oct 21, 2025Updated 4 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction☆379Mar 7, 2025Updated 11 months ago
- The model, data and code for the visual GUI Agent SeeClick☆467Jul 13, 2025Updated 7 months ago
- [ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding☆58Dec 13, 2024Updated last year
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents☆26Feb 17, 2026Updated last week
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆31Updated this week
- ☆25Sep 24, 2025Updated 5 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆35Jun 7, 2024Updated last year
- ☆20Jun 16, 2025Updated 8 months ago
- ☆63Dec 23, 2025Updated 2 months ago
- XL-VLMs: General Repository for eXplainable Large Vision Language Models☆46Sep 8, 2025Updated 5 months ago
- ☆22Sep 9, 2025Updated 5 months ago
- The first comprehensive multimodal language analysis benchmark for evaluating foundation models☆28Sep 22, 2025Updated 5 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆256Aug 12, 2025Updated 6 months ago
- [EMNLP 2025 Main] Official implementation of VRoPE: Rotary Position Embedding for Video Large Language Models.☆27Nov 18, 2025Updated 3 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation☆28Feb 25, 2025Updated last year
- Official Implementation for *PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling*☆32Dec 13, 2025Updated 2 months ago
- 《MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation》☆134Feb 2, 2026Updated 3 weeks ago
- Hands-On Image Processing with Python, Second Edition, Published by Packt☆26Feb 11, 2026Updated 2 weeks ago
- ☆125Oct 3, 2025Updated 4 months ago
- [ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models☆23Mar 29, 2025Updated 11 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆33Jul 25, 2025Updated 7 months ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆123Nov 25, 2024Updated last year
- implementation of dualformer☆24Mar 1, 2025Updated last year