WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
★242 · Feb 23, 2026 · Updated last month
Alternatives and similar repositories for WorkArena
Users that are interested in WorkArena are comparing it to the libraries listed below.
- 🌎💪 BrowserGym, a Gym environment for web task automation ★1,190 · Mar 17, 2026 · Updated 3 weeks ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ★557 · Mar 17, 2026 · Updated 3 weeks ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ★309 · Dec 16, 2025 · Updated 3 months ago
- DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats ★56 · Sep 12, 2025 · Updated 6 months ago
- VisualWebArena is a benchmark for multimodal agents. ★454 · Nov 9, 2024 · Updated last year
- Helping AI practitioners better understand their datasets and models in text classification. From ServiceNow. ★72 · Dec 23, 2024 · Updated last year
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ★393 · Apr 2, 2026 · Updated last week
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ★1,418 · Nov 26, 2025 · Updated 4 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★160 · Feb 11, 2025 · Updated last year
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ★305 · Mar 11, 2026 · Updated 3 weeks ago
- ★41 · Jul 21, 2024 · Updated last year
- Accelerating your LLM training to full speed! Made with ❤️ by ServiceNow Research ★302 · Updated this week
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★972 · Nov 5, 2025 · Updated 5 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★149 · Nov 26, 2024 · Updated last year
- ★10 · Nov 8, 2020 · Updated 5 years ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ★515 · Jun 6, 2025 · Updated 10 months ago
- ★28 · May 26, 2021 · Updated 4 years ago
- An AI-powered literature review assistant for researchers ★28 · Apr 18, 2025 · Updated 11 months ago
- Setup scripts for the WebArena benchmark ★20 · Jun 19, 2025 · Updated 9 months ago
- ★16 · May 6, 2025 · Updated 11 months ago
- Web-grounded natural language instructions ★18 · Nov 25, 2024 · Updated last year
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" ★1,062 · Mar 4, 2024 · Updated 2 years ago
- To Run, Manage and Visualize Large Scale Experiments ★172 · Sep 24, 2023 · Updated 2 years ago
- Towards Large Multimodal Models as Visual Foundation Agents ★263 · Apr 24, 2025 · Updated 11 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★842 · Feb 3, 2025 · Updated last year
- MiniWoB++: a web interaction benchmark for reinforcement learning ★377 · May 5, 2025 · Updated 11 months ago
- Code and Data for Tau-Bench ★1,165 · Mar 18, 2026 · Updated 3 weeks ago
- The model, data and code for the visual GUI Agent SeeClick ★477 · Jul 13, 2025 · Updated 8 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ★161 · Oct 30, 2024 · Updated last year
- ★16 · Apr 9, 2021 · Updated 4 years ago
- Code for the paper 🌳 Tree Search for Language Model Agents ★221 · Jul 25, 2024 · Updated last year
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ★2,733 · Apr 2, 2026 · Updated last week
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ★68 · Jan 7, 2026 · Updated 3 months ago
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations is a ServiceNow Research project that was started at Elemen… ★13 · Jul 31, 2023 · Updated 2 years ago
- An agent benchmark with tasks in a simulated software company. ★671 · Nov 17, 2025 · Updated 4 months ago
- ★59 · Jan 28, 2025 · Updated last year
- ★35 · May 16, 2025 · Updated 10 months ago
- Interaction-first method for generating demonstrations for web-agents on any website ★55 · Apr 29, 2025 · Updated 11 months ago
- An enterprise deep research benchmark ★35 · Mar 22, 2026 · Updated 2 weeks ago