MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆1,005 · Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the libraries listed below.
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,313 · Updated 2 months ago
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆815 · Updated 9 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆816 · Updated 11 months ago
- BrowserGym, a Gym environment for web task automation ☆1,090 · Updated last week
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆482 · Updated 2 months ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,204 · Updated 7 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆504 · Updated last week
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆934 · Updated 2 months ago
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆499 · Updated last year
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆431 · Updated 9 months ago
- VisualWebArena is a benchmark for multimodal agents. ☆429 · Updated last year
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,564 · Updated last year
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆377 · Updated 10 months ago
- Agentless: an agentless approach to automatically solve software development problems ☆2,003 · Updated last year
- An agent benchmark with tasks in a simulated software company. ☆631 · Updated 2 months ago
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆393 · Updated last week
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,066 · Updated last year
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,868 · Updated 8 months ago
- ☆641 · Updated 2 months ago
- Code and Data for Tau-Bench ☆1,079 · Updated 5 months ago
- ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24) ☆562 · Updated last year
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆2,506 · Updated this week
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆599 · Updated 5 months ago
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆1,130 · Updated last week
- AI computer use powered by open source LLMs and E2B Desktop Sandbox ☆1,760 · Updated 7 months ago
- [ICLR 2025] Automated Design of Agentic Systems ☆1,498 · Updated last year
- Agent-as-a-Judge: The Magic for Open-Endedness ☆718 · Updated 8 months ago
- Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs)… ☆1,479 · Updated last week
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,675 · Updated last week
- An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in… ☆796 · Updated 3 months ago