MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆847 · Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the repositories listed below.
- 🌎💪 BrowserGym, a Gym environment for web task automation ☆806 · Updated last week
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,048 · Updated 5 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆760 · Updated 5 months ago
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆727 · Updated 2 months ago
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆397 · Updated last month
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,145 · Updated last month
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆841 · Updated 3 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆358 · Updated last week
- Agentless🐱: an agentless approach to automatically solve software development problems ☆1,781 · Updated 6 months ago
- An agent benchmark with tasks in a simulated software company. ☆488 · Updated last week
- [ICLR 2025] Automated Design of Agentic Systems ☆1,373 · Updated 5 months ago
- ☆611 · Updated 6 months ago
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,289 · Updated last year
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,623 · Updated last month
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆482 · Updated 3 weeks ago
- Code and Data for Tau-Bench ☆666 · Updated this week
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,370 · Updated last month
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use. ☆1,000 · Updated last week
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆1,323 · Updated 3 months ago
- AI computer use powered by open source LLMs and E2B Desktop Sandbox ☆1,336 · Updated last month
- Open-source resources on agents for computer use. ☆354 · Updated 5 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆473 · Updated 9 months ago
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,045 · Updated 7 months ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆356 · Updated 2 months ago
- ⚖️ The First Coding Agent-as-a-Judge ☆578 · Updated 2 months ago
- VisualWebArena is a benchmark for multimodal agents. ☆357 · Updated 8 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆332 · Updated 4 months ago
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆351 · Updated 3 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆1,968 · Updated this week
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆868 · Updated this week