☆30 Jul 3, 2025 Updated 7 months ago
Alternatives and similar repositories for UI-Vision
Users who are interested in UI-Vision are comparing it to the libraries listed below
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 Jan 6, 2026 Updated last month
- ☆13 Nov 5, 2024 Updated last year
- Repository of GUI Action Narrator ☆12 Apr 8, 2025 Updated 10 months ago
- GroundCUA ☆68 Dec 24, 2025 Updated 2 months ago
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner. ☆34 Nov 10, 2025 Updated 3 months ago
- ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World ☆24 Jun 17, 2025 Updated 8 months ago
- ☆25 Jan 28, 2026 Updated last month
- ☆118 Apr 8, 2025 Updated 10 months ago
- Benchmark of complex, multimodal desktop-oriented tasks for advanced GUI-navigation AI agents ☆24 May 7, 2025 Updated 9 months ago
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos ☆49 Feb 22, 2026 Updated last week
- [AAAI 2026] Release of code, datasets and model for our work TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for General… ☆73 Dec 1, 2025 Updated 3 months ago
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework. ☆113 Jul 27, 2025 Updated 7 months ago
- ☆20 Apr 24, 2024 Updated last year
- Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent w… ☆100 Sep 8, 2025 Updated 5 months ago
- "MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation" ☆134 Feb 2, 2026 Updated 3 weeks ago
- R1-like Computer-use Agent ☆89 Mar 21, 2025 Updated 11 months ago
- This project is a Token Sale dApp that allows one to buy tokens and also displays recently minted tokens on the Solana blockchain using t… ☆11 Jul 30, 2024 Updated last year
- Your command-line, context-aware chatbot for instant codebase insights & more ✨ ☆16 May 30, 2024 Updated last year
- ☆13 Nov 21, 2025 Updated 3 months ago
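- A free agent that controls your phone and carries out your instructions! Let the agent operate your phone to order takeout, hail a ride, or message friends on WeChat or QQ... and free your hands! No paid APIs such as GPT-4o required! ☆71 Jun 20, 2025 Updated 8 months ago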
- Middleware and macros/ui extensions to control smart buildings with Webex devices ☆20 Jul 31, 2025 Updated 7 months ago
- ☆12 Oct 17, 2025 Updated 4 months ago
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel ☆17 Nov 8, 2025 Updated 3 months ago
- GUIPilot: A Consistency-based Mobile GUI Testing Approach for Detecting Application-specific Bugs ☆14 Jan 5, 2026 Updated last month
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ☆51 Dec 23, 2024 Updated last year
- Digital advertising is becoming increasingly important. At the same time, however, the problems of this type of marketing are becoming mo… ☆11 Oct 4, 2022 Updated 3 years ago
- Repository for the D ONE MLOps AWS BlogPost ☆11 Aug 13, 2024 Updated last year
- An autonomous service implementing a decentralized Impact Evaluator ☆13 Dec 1, 2025 Updated 3 months ago
- List of subdomain and dynamic DNS providers ☆10 Dec 11, 2025 Updated 2 months ago
- Kernel CLI ☆13 Updated this week
- An isolated environment for DNS cache poisoning attack investigation and demonstration. ☆10 Nov 22, 2020 Updated 5 years ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆43 Dec 9, 2024 Updated last year
- [NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆150 Nov 6, 2025 Updated 3 months ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration ☆20 Mar 21, 2025 Updated 11 months ago
- ARCHIVED ☆11 Jul 11, 2025 Updated 7 months ago
- ☆11 Jan 28, 2025 Updated last year
- LLM as World Models using Bayesian inference ☆16 May 27, 2025 Updated 9 months ago
- POSIX-compliant CLI tool that wraps tmux ≥ 3.0 and git worktree to manage parallel AI coding sessions ("heads") ☆27 Dec 30, 2025 Updated 2 months ago
- ☆14 Feb 12, 2024 Updated 2 years ago