MinorJerry / WebVoyager
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
☆324 Updated 8 months ago
Related projects
Alternatives and complementary repositories for WebVoyager
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆637 Updated 2 weeks ago
- ☆521 Updated last month
- Official Repo for ICML 2024 paper "Executable Code Actions Elicit Better LLM Agents" by Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhan… ☆481 Updated 5 months ago
- ☆310 Updated last month
- Code for Husky, an open-source language agent that solves complex, multi-step reasoning tasks. Husky v1 addresses numerical, tabular and … ☆327 Updated 4 months ago
- Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents. ☆460 Updated this week
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆737 Updated last month
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" ☆704 Updated 3 months ago
- Implementation of the ScreenAI model from the paper: "A Vision-Language Model for UI and Infographics Understanding" ☆289 Updated this week
- AWM: Agent Workflow Memory ☆203 Updated last month
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆187 Updated this week
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ☆447 Updated 7 months ago
- Official implementation of the paper "AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation" ☆418 Updated 4 months ago
- Autonomous Agents (LLMs) research papers. Updated Daily. ☆492 Updated this week
- Agent S: an open agentic framework that uses computers like a human ☆556 Updated this week
- Agent driven automation starting with the web. Discord: https://discord.gg/wgNfmFuqJF ☆799 Updated this week
- ☆280 Updated 7 months ago
- Code and Data for Tau-Bench ☆193 Updated 2 weeks ago
- Environments, tools, and benchmarks for general computer agents ☆171 Updated 2 weeks ago
- VisualWebArena is a benchmark for multimodal agents. ☆235 Updated last month
- An example of multi-agent orchestration with llama-index ☆314 Updated 2 weeks ago
- Agentless🐱: an agentless approach to automatically solve software development problems ☆710 Updated last week
- Super performant RAG pipelines for AI apps. Summarization, Retrieve/Rerank and Code Interpreters in one simple API. ☆340 Updated 6 months ago
- NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRav… ☆306 Updated last year
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆978 Updated last month
- Implementation of Google's SELF-DISCOVER ☆281 Updated 3 months ago
- BrowserGym, a gym environment for web task automation in the Chromium browser. ☆316 Updated this week
- A compilation of the best multi-agent papers ☆247 Updated last week
- Task-based Agentic Framework using StrictJSON as the core ☆436 Updated 3 weeks ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆147 Updated this week