microsoft / WindowsAgentArena
Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking multi-modal AI agents.
718 stars, updated last month
Alternatives and similar repositories for WindowsAgentArena
Users who are interested in WindowsAgentArena are comparing it to the libraries listed below.
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. (381 stars, updated 3 weeks ago)
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents (351 stars, updated 2 months ago)
- BrowserGym, a Gym environment for web task automation (776 stars, updated last week)
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… (348 stars, updated this week)
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" (827 stars, updated last year)
- agent q - oss advanced reasoning and learning for autonomous ai agents (471 stars, updated 8 months ago)
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction (318 stars, updated 3 months ago)
- GUI Grounding for Professional High-Resolution Computer Use (213 stars, updated last month)
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ (347 stars, updated last week)
- xLAM: A Family of Large Action Models to Empower AI Agent Systems (465 stars, updated last week)
- The First Coding Agent-as-a-Judge (558 stars, updated last month)
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" (1,028 stars, updated 4 months ago)
- VisualWebArena is a benchmark for multimodal agents. (350 stars, updated 7 months ago)
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use. (978 stars, updated this week)
- AWM: Agent Workflow Memory (275 stars, updated 4 months ago)
- AndroidWorld is an environment and benchmark for autonomous agents (339 stars, updated this week)
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… (725 stars, updated 8 months ago)
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments (1,938 stars, updated this week)
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… (754 stars, updated 4 months ago)
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents (254 stars, updated 3 weeks ago)
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL (406 stars, updated 2 weeks ago)
- A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. (743 stars, updated 3 weeks ago)
- An agent benchmark with tasks in a simulated software company. (397 stars, updated last week)
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. (1,317 stars, updated 3 weeks ago)
- Atom of Thoughts for Markov LLM Test-Time Scaling (574 stars, updated last week)
- Autonomous Agents (LLMs) research papers. Updated Daily. (855 stars, updated this week)
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering (760 stars, updated this week)
- AIDE: AI-Driven Exploration in the Space of Code. State-of-the-art machine learning engineering agent that automates AI R&D. (930 stars, updated 2 months ago)
- Agentless: an agentless approach to automatically solve software development problems (1,743 stars, updated 6 months ago)
- GitHub page for "Large Language Model-Brained GUI Agents: A Survey" (167 stars, updated last month)