MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆911 · Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the repositories listed below.
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆433 · Updated 3 months ago
- BrowserGym, a Gym environment for web task automation ☆891 · Updated this week
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆764 · Updated 4 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆783 · Updated 7 months ago
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,134 · Updated last week
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,167 · Updated 3 months ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆870 · Updated 5 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆404 · Updated this week
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆379 · Updated 4 months ago
- VisualWebArena is a benchmark for multimodal agents. ☆374 · Updated 10 months ago
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,671 · Updated 3 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆488 · Updated 11 months ago
- Agentless: an agentless approach to automatically solve software development problems ☆1,904 · Updated 8 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆2,138 · Updated this week
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆356 · Updated 6 months ago
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆553 · Updated 3 weeks ago
- The First Coding Agent-as-a-Judge ☆626 · Updated 4 months ago
- Code and Data for Tau-Bench ☆834 · Updated 2 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆546 · Updated 3 weeks ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,413 · Updated 7 months ago
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆372 · Updated 2 months ago
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,051 · Updated 9 months ago
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,466 · Updated 3 months ago
- E2B Desktop Sandbox for LLMs. E2B Sandbox with desktop graphical environment that you can connect to any LLM for secure computer use. ☆1,092 · Updated last week
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,683 · Updated last year
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆364 · Updated last week
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,379 · Updated last year
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆1,374 · Updated 6 months ago
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆1,009 · Updated this week
- Open-source resources on agents for computer use. ☆369 · Updated 7 months ago