princeton-nlp / WebShop
[NeurIPS 2022] WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
☆351 · Updated 8 months ago
Alternatives and similar repositories for WebShop
Users who are interested in WebShop are comparing it to the repositories listed below.
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral] ☆318 · Updated last year
- VisualWebArena is a benchmark for multimodal agents. ☆347 · Updated 6 months ago
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks ☆309 · Updated 7 months ago
- Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆206 · Updated 2 weeks ago
- ICML 2024: Improving Factuality and Reasoning in Language Models through Multiagent Debate ☆437 · Updated last month
- An extensible benchmark for evaluating large language models on planning ☆372 · Updated last month
- Code for the paper Tree Search for Language Model Agents ☆199 · Updated 10 months ago
- Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022) ☆205 · Updated 2 years ago
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆223 · Updated this week
- FireAct: Toward Language Agent Fine-tuning ☆278 · Updated last year
- Reasoning with Language Model is Planning with World Model ☆166 · Updated last year
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback ☆206 · Updated 2 years ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆108 · Updated 2 months ago
- RewardBench: the first evaluation tool for reward models. ☆582 · Updated this week
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024) ☆237 · Updated 10 months ago
- Papers related to LLM agents published at top conferences ☆315 · Updated last month
- ALFWorld: Aligning Text and Embodied Environments for Interactive Learning ☆466 · Updated 4 months ago
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels … ☆263 · Updated last year
- ☆181 · Updated 4 months ago
- AWM: Agent Workflow Memory ☆271 · Updated 4 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆174 · Updated last month
- Data and Code for Program of Thoughts (TMLR 2023) ☆274 · Updated last year
- ☆329 · Updated 3 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆218 · Updated last year
- Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Langu… ☆346 · Updated last year
- Paper collection on building and evaluating language model agents via executable language grounding ☆355 · Updated last year
- [NAACL 2025] KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents ☆223 · Updated 4 months ago
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral) ☆262 · Updated last year
- [ICLR 2024] Lemur: Open Foundation Models for Language Agents ☆548 · Updated last year
- Simple next-token-prediction for RLHF ☆226 · Updated last year