Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, including Windows, Linux, macOS, iOS, Android and Web.
☆111 · Sep 8, 2025 · Updated 9 months ago
Alternatives and similar repositories for MMBench-GUI
Users that are interested in MMBench-GUI are comparing it to the libraries listed below.
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆120 · Jul 17, 2025 · Updated 10 months ago
- ☆22 · May 3, 2025 · Updated last year
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆54 · Jun 12, 2025 · Updated 11 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆186 · Oct 8, 2025 · Updated 8 months ago
- VeriWeb: Verifiable Long-Chain Web Benchmark for Agentic Information-Seeking ☆89 · Jan 21, 2026 · Updated 4 months ago
- ☆32 · Jul 3, 2025 · Updated 11 months ago
- [NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents ☆58 · Nov 27, 2025 · Updated 6 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆279 · Aug 12, 2025 · Updated 9 months ago
- The model, data and code for the visual GUI Agent SeeClick ☆483 · Jul 13, 2025 · Updated 10 months ago
- [NeurIPS 2025] "Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning" ☆104 · Oct 21, 2025 · Updated 7 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆392 · Mar 7, 2025 · Updated last year
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning ☆15 · Jun 28, 2025 · Updated 11 months ago
- Responsible Robotic Manipulation ☆15 · Aug 31, 2025 · Updated 9 months ago
- [ICCV2025] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding ☆60 · Apr 4, 2026 · Updated 2 months ago
- BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions [TMLR2025] ☆33 · Jan 13, 2026 · Updated 4 months ago
- MiroTrain is an efficient and algorithm-first framework research agent. ☆141 · Aug 27, 2025 · Updated 9 months ago
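- Reference solutions for the OJ (online judge) of the Data and Algorithms course, Department of Electronic Engineering, Tsinghua University, Fall 2022 semester ☆10 · Jun 18, 2023 · Updated 2 years ago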
- ☆89 · Dec 23, 2025 · Updated 5 months ago
- Learning 1D Causal Visual Representation with De-focus Attention Networks ☆35 · Jun 7, 2024 · Updated 2 years ago
- ☆18 · Mar 2, 2026 · Updated 3 months ago
- [ICLR 2025 Spotlight] OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text ☆423 · May 5, 2025 · Updated last year
- XL-VLMs: General Repository for eXplainable Large Vision Language Models ☆49 · Sep 8, 2025 · Updated 9 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆159 · May 29, 2025 · Updated last year
- An implementation of tone enhancement. May refer to "Two-scale Tone Management for Photographic Look", SIGGRAPH 2006. ☆13 · Mar 30, 2017 · Updated 9 years ago
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆35 · May 15, 2026 · Updated 3 weeks ago
- The first comprehensive multimodal language analysis benchmark for evaluating foundation models ☆31 · Sep 22, 2025 · Updated 8 months ago
- Resa: Transparent Reasoning Models via SAEs ☆49 · Sep 23, 2025 · Updated 8 months ago
- Official Implementation for *PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling* ☆40 · Dec 13, 2025 · Updated 5 months ago
- [AAAI 2026] Release of code, datasets and model for our work TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for General… ☆111 · Dec 1, 2025 · Updated 6 months ago
- GUI Grounding for Professional High-Resolution Computer Use ☆374 · Apr 14, 2026 · Updated last month
- MobileUse: an open-source mobile GUI agent for Android phone automation, AndroidWorld/AndroidLab evaluation, hierarchical reflection, and… ☆154 · May 7, 2026 · Updated last month
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆125 · Nov 25, 2024 · Updated last year
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation ☆64 · Jul 11, 2025 · Updated 10 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆33 · Jul 25, 2025 · Updated 10 months ago
- Code for Research Project TLDR ☆25 · Jul 28, 2025 · Updated 10 months ago
- ☆33 · Oct 2, 2024 · Updated last year
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models. ☆163 · Apr 6, 2026 · Updated 2 months ago
- ☆30 · Sep 24, 2025 · Updated 8 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 · Jun 28, 2024 · Updated last year