[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
★2,608 · Feb 28, 2026 · Updated this week
Alternatives and similar repositories for OSWorld
Users who are interested in OSWorld are comparing it to the repositories listed below.
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ★1,353 · Nov 26, 2025 · Updated 3 months ago
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ★826 · Feb 11, 2026 · Updated 3 weeks ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ★437 · Apr 20, 2025 · Updated 10 months ago
- VisualWebArena is a benchmark for multimodal agents. ★440 · Nov 9, 2024 · Updated last year
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ★381 · Mar 7, 2025 · Updated 11 months ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ★300 · Jul 18, 2025 · Updated 7 months ago
- [COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild ★4,716 · Nov 18, 2024 · Updated last year
- The model, data and code for the visual GUI Agent SeeClick ★469 · Jul 13, 2025 · Updated 7 months ago
- AndroidWorld is an environment and benchmark for autonomous agents ★640 · Feb 24, 2026 · Updated last week
- Pioneering Automated GUI Interaction with Native Agents ★9,712 · Jan 27, 2026 · Updated last month
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ★1,753 · Sep 9, 2024 · Updated last year
- [NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis ★150 · Nov 6, 2025 · Updated 3 months ago
- BrowserGym, a Gym environment for web task automation ★1,140 · Feb 10, 2026 · Updated 3 weeks ago
- SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersec… ★18,581 · Updated this week
- Agent S: an open agentic framework that uses computers like a human ★9,912 · Feb 21, 2026 · Updated last week
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★826 · Feb 3, 2025 · Updated last year
- The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical-user… ★1,329 · Feb 13, 2025 · Updated last year
- AIOS: AI Agent Operating System ★5,249 · Jan 22, 2026 · Updated last month
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ★525 · Feb 10, 2026 · Updated 3 weeks ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★952 · Nov 5, 2025 · Updated 4 months ago
- Building a comprehensive and handy list of papers for GUI agents ★641 · Oct 27, 2025 · Updated 4 months ago
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ★1,898 · May 21, 2025 · Updated 9 months ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) ★3,187 · Feb 8, 2026 · Updated 3 weeks ago
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ★389 · Feb 22, 2025 · Updated last year
- Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time. ★21,340 · Feb 24, 2026 · Updated last week
- A project structure aware autonomous software engineer aiming for autonomous program improvement. Resolved 37.3% tasks (pass@1) in SWE-be… ★3,058 · Apr 24, 2025 · Updated 10 months ago
- A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. ★1,125 · Aug 17, 2025 · Updated 6 months ago
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ★500 · Nov 7, 2025 · Updated 3 months ago
- Large Action Model framework to develop AI Web Agents ★6,311 · Jan 21, 2025 · Updated last year
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ★1,724 · Jan 20, 2026 · Updated last month
- A framework for Claude Opus to intelligently orchestrate subagents. ★4,324 · Jul 1, 2024 · Updated last year
- Mobile-Agent: The Powerful GUI Agent Family ★7,971 · Updated this week
- AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps. ★6,559 · Mar 19, 2025 · Updated 11 months ago
- [ICLR 2026] Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents ★58 · Feb 26, 2026 · Updated last week
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ★1,033 · Mar 4, 2024 · Updated 2 years ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★148 · Nov 26, 2024 · Updated last year
- GUI Grounding for Professional High-Resolution Computer Use ★333 · Feb 20, 2026 · Updated last week
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ★228 · Jun 16, 2025 · Updated 8 months ago
- OpenHands: AI-Driven Development ★68,459 · Updated this week