WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
★252 · Apr 25, 2026 · Updated last month
Alternatives and similar repositories for WorkArena
Users that are interested in WorkArena are comparing it to the libraries listed below.
- BrowserGym, a Gym environment for web task automation ★1,241 · Mar 17, 2026 · Updated 2 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ★585 · Mar 17, 2026 · Updated 2 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ★319 · Dec 16, 2025 · Updated 5 months ago
- DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats ★61 · Sep 12, 2025 · Updated 8 months ago
- VisualWebArena is a benchmark for multimodal agents. ★477 · Nov 9, 2024 · Updated last year
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ★415 · Updated this week
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ★1,499 · Nov 26, 2025 · Updated 6 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★160 · Feb 11, 2025 · Updated last year
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ★317 · Mar 11, 2026 · Updated 2 months ago
- ★41 · Jul 21, 2024 · Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ★314 · Updated this week
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★999 · Nov 5, 2025 · Updated 7 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★149 · Nov 26, 2024 · Updated last year
- ★10 · Nov 8, 2020 · Updated 5 years ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ★526 · Jun 6, 2025 · Updated last year
- ★28 · May 26, 2021 · Updated 5 years ago
- Setup scripts for the WebArena benchmark ★22 · Jun 19, 2025 · Updated 11 months ago
- ★16 · May 6, 2025 · Updated last year
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ★54 · Feb 27, 2025 · Updated last year
- An AI-powered literature review assistant for researchers ★36 · May 7, 2026 · Updated last month
- Multimodal computer agent data collection program ★170 · Dec 5, 2025 · Updated 6 months ago
- Web-grounded natural language instructions ★18 · Nov 25, 2024 · Updated last year
- Demos for the MiniWoB++ benchmark ★21 · Feb 23, 2018 · Updated 8 years ago
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ★1,093 · Mar 4, 2024 · Updated 2 years ago
- To Run, Manage and Visualize Large Scale Experiments ★173 · Sep 24, 2023 · Updated 2 years ago
- Towards Large Multimodal Models as Visual Foundation Agents ★267 · Apr 24, 2025 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★845 · Feb 3, 2025 · Updated last year
- MiniWoB++: a web interaction benchmark for reinforcement learning ★384 · May 27, 2026 · Updated last week
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ★2,911 · Updated this week
- The model, data and code for the visual GUI Agent SeeClick ★483 · Jul 13, 2025 · Updated 10 months ago
- Code and Data for Tau-Bench ★1,255 · Mar 18, 2026 · Updated 2 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ★164 · Oct 30, 2024 · Updated last year
- ★16 · Apr 9, 2021 · Updated 5 years ago
- Code for the paper 🌳 Tree Search for Language Model Agents ★222 · Jul 25, 2024 · Updated last year
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ★69 · Jan 7, 2026 · Updated 5 months ago
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations is a ServiceNow Research project that was started at Elemen… ★13 · Jul 31, 2023 · Updated 2 years ago
- ★59 · Jan 28, 2025 · Updated last year
- An agent benchmark with tasks in a simulated software company. ★717 · Nov 17, 2025 · Updated 6 months ago
- ★35 · May 16, 2025 · Updated last year