MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆894 Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the repositories listed below.
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,107 Updated this week
- BrowserGym, a Gym environment for web task automation ☆861 Updated 3 weeks ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆773 Updated 6 months ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,163 Updated 2 months ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆863 Updated 4 months ago
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆425 Updated 2 months ago
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆750 Updated 3 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆392 Updated this week
- An agent benchmark with tasks in a simulated software company. ☆534 Updated this week
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆369 Updated 4 months ago
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,666 Updated 3 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆2,096 Updated this week
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆483 Updated 11 months ago
- VisualWebArena is a benchmark for multimodal agents. ☆368 Updated 9 months ago
- Agentless: an agentless approach to automatically solve software development problems ☆1,868 Updated 8 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆352 Updated 5 months ago
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,435 Updated 2 months ago
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆967 Updated last week
- Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP'24] ☆475 Updated 7 months ago
- Code and Data for Tau-Bench ☆791 Updated last month
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆537 Updated this week
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆1,365 Updated 5 months ago
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆367 Updated last month
- AWM: Agent Workflow Memory ☆303 Updated 6 months ago
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆357 Updated last week
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,343 Updated last year
- AndroidWorld is an environment and benchmark for autonomous agents ☆387 Updated this week
- [ICLR 2025] Automated Design of Agentic Systems ☆1,402 Updated 7 months ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ☆269 Updated last month
- ☆624 Updated 7 months ago