MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆991 · Updated last year
Alternatives and similar repositories for WebVoyager
Users who are interested in WebVoyager are comparing it to the repositories listed below.
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,279 · Updated last month
- Windows Agent Arena (WAA) is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆807 · Updated 8 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆811 · Updated 11 months ago
- BrowserGym, a Gym environment for web task automation ☆1,064 · Updated 3 weeks ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆929 · Updated 2 months ago
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆475 · Updated 2 months ago
- Agent driven automation starting with the web. Try it: https://www.emergence.ai/web-automation-api ☆1,198 · Updated 7 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… ☆488 · Updated 3 weeks ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆427 · Updated 8 months ago
- VisualWebArena is a benchmark for multimodal agents. ☆420 · Updated last year
- An agent benchmark with tasks in a simulated software company. ☆619 · Updated last month
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,065 · Updated last year
- Out-of-the-box (OOTB) GUI Agent for Windows and macOS ☆1,854 · Updated 7 months ago
- ☆641 · Updated last month
- A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks. ☆1,722 · Updated last year
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆1,521 · Updated last year
- AWM: Agent Workflow Memory ☆376 · Updated 2 weeks ago
- Code and Data for Tau-Bench ☆1,037 · Updated 4 months ago
- [NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments ☆2,441 · Updated this week
- agent q - oss advanced reasoning and learning for autonomous ai agents ☆500 · Updated last year
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆388 · Updated last week
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆376 · Updated 10 months ago
- xLAM: A Family of Large Action Models to Empower AI Agent Systems ☆596 · Updated 4 months ago
- Agentless: an agentless approach to automatically solve software development problems ☆1,994 · Updated last year
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆371 · Updated 2 months ago
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,636 · Updated 7 months ago
- Examples of using E2B ☆1,240 · Updated 3 weeks ago
- ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24) ☆551 · Updated last year
- Agent-as-a-Judge: The Magic for Open-Endedness ☆703 · Updated 7 months ago
- Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP 24'] ☆484 · Updated last year