[NeurIPS 2022] WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
★507 · Sep 6, 2024 · Updated last year
Alternatives and similar repositories for WebShop
Users that are interested in WebShop are comparing it to the libraries listed below.
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★963 · Nov 5, 2025 · Updated 4 months ago
- ALFWorld: Aligning Text and Embodied Environments for Interactive Learning ★688 · Feb 8, 2026 · Updated last month
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ★1,400 · Nov 26, 2025 · Updated 3 months ago
- MiniWoB++: a web interaction benchmark for reinforcement learning ★375 · May 5, 2025 · Updated 10 months ago
- VisualWebArena is a benchmark for multimodal agents. ★450 · Nov 9, 2024 · Updated last year
- Code for the paper "LASER: LLM Agent with State-Space Exploration for Web Navigation" ★35 · Sep 26, 2023 · Updated 2 years ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) ★3,253 · Feb 8, 2026 · Updated last month
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024) ★256 · Jul 16, 2024 · Updated last year
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ★243 · May 5, 2024 · Updated last year
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ★161 · Feb 11, 2025 · Updated last year
- [ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models ★3,672 · Feb 6, 2024 · Updated 2 years ago
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum. ★343 · Dec 3, 2025 · Updated 3 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ★148 · Nov 26, 2024 · Updated last year
- [ICML 2024] Official repository for "Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models" ★820 · Jul 30, 2024 · Updated last year
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ★401 · May 20, 2024 · Updated last year
- A codebase for "Language Models can Solve Computer Tasks" ★240 · May 1, 2024 · Updated last year
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. ★392 · Feb 22, 2025 · Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ★202 · Apr 17, 2025 · Updated 11 months ago
- ★118 · Apr 8, 2025 · Updated 11 months ago
- FireAct: Toward Language Agent Fine-tuning ★292 · Oct 22, 2023 · Updated 2 years ago
- AgentTuning: Enabling Generalized Agent Abilities for LLMs ★1,484 · Oct 31, 2023 · Updated 2 years ago
- A Universal Platform for Training and Evaluation of Mobile Interaction ★61 · Sep 24, 2025 · Updated 6 months ago
- Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhi… ★749 · Sep 11, 2025 · Updated 6 months ago
- Workflow-Guided Exploration: sample-efficient RL agent for web tasks ★118 · Jun 5, 2023 · Updated 2 years ago
- AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and Interactive Coding Agent, ACL'24 Best Resource… ★392 · Feb 17, 2026 · Updated last month
- GUICourse: From General Vision Language Models to Versatile GUI Agents ★137 · Mar 1, 2026 · Updated 3 weeks ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ★52 · Feb 27, 2025 · Updated last year
- Towards Large Multimodal Models as Visual Foundation Agents ★259 · Apr 24, 2025 · Updated 11 months ago
- ★15 · Mar 26, 2024 · Updated 2 years ago
- [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning ★3,100 · Jan 14, 2025 · Updated last year
- Code for the paper "Tree Search for Language Model Agents" ★221 · Jul 25, 2024 · Updated last year
- ★208 · Dec 20, 2024 · Updated last year
- ★111 · Jul 2, 2024 · Updated last year
- An extensible benchmark for evaluating large language models on planning ★457 · Sep 17, 2025 · Updated 6 months ago
- List of language agents based on paper "Cognitive Architectures for Language Agents" ★1,192 · Jan 16, 2025 · Updated last year
- [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning. ★5,559 · May 21, 2025 · Updated 10 months ago
- BrowserGym, a Gym environment for web task automation ★1,170 · Mar 17, 2026 · Updated last week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ★513 · Jun 6, 2025 · Updated 9 months ago
- [ICCV'23] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ★218 · Mar 26, 2025 · Updated last year