xlang-ai/OpenCUA

[NeurIPS 2025 Spotlight] OpenCUA: Open Foundations for Computer-Use Agents

☆799

Alternatives and similar repositories for OpenCUA

Users interested in OpenCUA are comparing it to the repositories listed below.

xlang-ai / AgentNetTool
This is the official code base of AgentNetTool in OpenCUA. Website: https://opencua.xlang.ai/
☆51 · Sep 3, 2025 · Updated 10 months ago
xlang-ai / aguvis
[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
☆389 · Mar 7, 2025 · Updated last year
xlang-ai / OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
☆3,010 · Jul 7, 2026 · Updated last week
likaixin2000 / ScreenSpot-Pro-GUI-Grounding
GUI Grounding for Professional High-Resolution Computer Use
☆383 · Jun 17, 2026 · Updated 3 weeks ago
xlang-ai / OSWorld-V2
OSWorld 2.0: Benchmarking Computer Use Agents on Long-Horizon Real-World Tasks
☆187 · Updated this week
xlang-ai / OSWorld-G
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
☆171 · Jun 18, 2026 · Updated 3 weeks ago
WukLab / osworld-human
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
☆27 · May 17, 2026 · Updated last month
xlang-ai / computer-agent-arena
[ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents
☆66 · Feb 26, 2026 · Updated 4 months ago
xlang-ai / CUA-Gym-Hub
CUA-Gym-Hub: mock web apps as reproducible RL training environments for computer-use agents
☆63 · Updated this week
zhangmiaosen2000 / Phi-Ground
Home page for Microsoft Phi-Ground tech-report
☆22 · Sep 8, 2025 · Updated 10 months ago
ranpox / awesome-computer-use
This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects.
☆569 · Apr 15, 2026 · Updated 3 months ago
bytedance / UI-TARS
Pioneering Automated GUI Interaction with Native Agents
☆11,168 · Jan 27, 2026 · Updated 5 months ago
inclusionAI / M2-Reasoning
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
☆47 · Jul 17, 2025 · Updated 11 months ago
McGill-NLP / agent-reward-bench
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
☆47 · Aug 7, 2025 · Updated 11 months ago
xlang-ai / CUA-Gym
Scalable pipeline for synthesizing verifiable RLVR training data for computer-use agents
☆173 · May 26, 2026 · Updated last month
microsoft / MageBench
Official Repo for MageBench: Bridging Large Multimodal Models to Agents
☆22 · Jan 8, 2025 · Updated last year
OS-Copilot / OS-Genesis
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
☆188 · Oct 8, 2025 · Updated 9 months ago
UITron-hub / UITron-Speech
☆21 · Jan 22, 2026 · Updated 5 months ago
stepfun-ai / NextStep-1
[🚀 ICLR 2026 Oral] NextStep-1: SOTA Autoregressive Image Generation with Continuous Tokens. A research project developed by the StepFun’s …
☆689 · Feb 27, 2026 · Updated 4 months ago
ByteDance-Seed / m3-agent
☆1,417 · Feb 12, 2026 · Updated 5 months ago
SunzeY / SEAgent
[ICML-2026] Official implementation of "SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience"
☆256 · Aug 7, 2025 · Updated 11 months ago
shizhediao / automate-cot
Source code for the paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data"
☆20 · Feb 24, 2024 · Updated 2 years ago
OPPO-PersonalAI / OAgents
Implementation for OAgents: An Empirical Study of Building Effective Agents
☆324 · Oct 13, 2025 · Updated 9 months ago
showlab / Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
☆1,196 · Aug 17, 2025 · Updated 10 months ago
iLearn-Lab / CVPR26-HiconAgent
[CVPR 2026] HiconAgent: History Context-aware Policy Optimization for GUI Agents
☆30 · Mar 9, 2026 · Updated 4 months ago
ServiceNow / GroundCUA
GroundCUA
☆128 · Mar 24, 2026 · Updated 3 months ago
OSU-NLP-Group / GUI-Agents-Paper-List
Awesome GUI Agent Paper List
☆854 · Jun 28, 2026 · Updated 2 weeks ago
X-PLUG / MobileAgent
Mobile-Agent: The Powerful GUI Agent Family
☆8,932 · Jul 7, 2026 · Updated last week
UITron-hub / UItron
☆67 · Sep 6, 2025 · Updated 10 months ago
simular-ai / Agent-S
Agent S: an open agentic framework that uses computers like a human
☆12,015 · May 13, 2026 · Updated 2 months ago
microsoft / WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
☆879 · Apr 13, 2026 · Updated 3 months ago
njucckevin / SeeClick
The model, data and code for the visual GUI Agent SeeClick
☆489 · Jul 13, 2025 · Updated last year
pythongod-exe / iGaussian
IROS
☆17 · Aug 10, 2025 · Updated 11 months ago
zai-org / CogAgent
An open-sourced end-to-end VLM-based GUI Agent
☆1,187 · Apr 4, 2025 · Updated last year
StarsfieldAI / R1-V
Witness the aha moment of VLM with less than $3.
☆4,065 · May 19, 2025 · Updated last year
meituan / MemOCR
MemOCR: an OCR-driven visual memory agent.
☆33 · May 17, 2026 · Updated last month
mll-lab-nu / RAGEN
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
☆2,747 · Apr 14, 2026 · Updated 3 months ago
yannqi / R-4B
The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"
☆141 · Sep 4, 2025 · Updated 10 months ago
alibaba / UI-Ins
Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning
☆77 · Apr 20, 2026 · Updated 2 months ago