WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
★234 · Feb 17, 2026 · Updated last week
Alternatives and similar repositories for WorkArena
Users who are interested in WorkArena are comparing it to the libraries listed below.
- 🌎💪 BrowserGym, a Gym environment for web task automation ★1,136 · Feb 10, 2026 · Updated 2 weeks ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ★522 · Feb 10, 2026 · Updated 2 weeks ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ★302 · Dec 16, 2025 · Updated 2 months ago
- VisualWebArena is a benchmark for multimodal agents. ★436 · Nov 9, 2024 · Updated last year
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★160 · Feb 11, 2025 · Updated last year
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ★298 · Jul 18, 2025 · Updated 7 months ago
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ★1,337 · Nov 26, 2025 · Updated 3 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ★368 · Feb 19, 2026 · Updated last week
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ★48 · Feb 27, 2025 · Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★947 · Nov 5, 2025 · Updated 3 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★148 · Nov 26, 2024 · Updated last year
- Web-grounded natural language instructions ★18 · Nov 25, 2024 · Updated last year
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ★510 · Jun 6, 2025 · Updated 8 months ago
- ★41 · Jul 21, 2024 · Updated last year
- Multimodal computer agent data collection program ★162 · Dec 5, 2025 · Updated 2 months ago
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ★1,028 · Mar 4, 2024 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★824 · Feb 3, 2025 · Updated last year
- The model, data and code for the visual GUI Agent SeeClick ★467 · Jul 13, 2025 · Updated 7 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ★2,587 · Feb 20, 2026 · Updated last week
- ★14 · May 9, 2024 · Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ★289 · Updated this week
- MiniWoB++: a web interaction benchmark for reinforcement learning ★371 · May 5, 2025 · Updated 9 months ago
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ★387 · Feb 22, 2025 · Updated last year
- ★20 · Apr 24, 2024 · Updated last year
- An agent benchmark with tasks in a simulated software company. ★643 · Nov 17, 2025 · Updated 3 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ★67 · Jan 7, 2026 · Updated last month
- GUICourse: From General Vision Language Models to Versatile GUI Agents ★136 · Jul 17, 2024 · Updated last year
- Towards Large Multimodal Models as Visual Foundation Agents ★256 · Apr 24, 2025 · Updated 10 months ago
- Code repo for the paper: Attacking Vision-Language Computer Agents via Pop-ups ★50 · Dec 23, 2024 · Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ★70 · Dec 9, 2024 · Updated last year
- ★34 · Mar 6, 2025 · Updated 11 months ago
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi… ★728 · Sep 11, 2025 · Updated 5 months ago
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos ★49 · Updated this week
- Code for the paper 🌳 Tree Search for Language Model Agents ★220 · Jul 25, 2024 · Updated last year
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ★228 · Jun 16, 2025 · Updated 8 months ago
- Building a comprehensive and handy list of papers for GUI agents ★636 · Oct 27, 2025 · Updated 4 months ago
- Code and Data for Tau-Bench ★1,103 · Aug 28, 2025 · Updated 6 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ★159 · Oct 30, 2024 · Updated last year
- ★16 · Apr 9, 2021 · Updated 4 years ago