MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆970 · Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the repositories listed below.
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,238 · Updated last week
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆792 · Updated 7 months ago
- BrowserGym, a Gym environment for web task automation ☆1,021 · Updated 3 weeks ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,190 · Updated 6 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆806 · Updated 10 months ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆906 · Updated last month
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆479 · Updated last week
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆464 · Updated 3 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆592 · Updated 2 weeks ago
- Agentless: an agentless approach to automatically solve software development problems ☆1,978 · Updated 11 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆497 · Updated last year
- VisualWebArena is a benchmark for multimodal agents. ☆410 · Updated last year
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆401 · Updated 7 months ago
- Code and Data for Tau-Bench ☆987 · Updated 3 months ago
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆584 · Updated 3 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆2,359 · Updated 2 weeks ago
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,469 · Updated last year
- ☆640 · Updated 3 weeks ago
- 🦀 CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆386 · Updated last week
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,060 · Updated 11 months ago
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,833 · Updated 6 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆371 · Updated 9 months ago
- 👩‍⚖️ Coding Agent-as-a-Judge ☆682 · Updated 6 months ago
- AI computer use powered by open source LLMs and E2B Desktop Sandbox ☆1,662 · Updated 6 months ago
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ☆782 · Updated 2 months ago
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,698 · Updated last year
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,574 · Updated 6 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,471 · Updated 10 months ago
- Open-source resources on agents for computer use. ☆385 · Updated last month
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use. ☆1,160 · Updated last month