karthikv792 / LLMs-Planning
An extensible benchmark for evaluating large language models on planning
☆375Updated last month
Alternatives and similar repositories for LLMs-Planning
Users who are interested in LLMs-Planning are comparing it to the repositories listed below
- Reasoning with Language Model is Planning with World Model☆167Updated last year
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.☆264Updated 2 weeks ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]☆320Updated last year
- VisualWebArena is a benchmark for multimodal agents.☆347Updated 6 months ago
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.☆314Updated 9 months ago
- Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022)☆205Updated 2 years ago
- RewardBench: the first evaluation tool for reward models.☆590Updated this week
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen…☆278Updated last year
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback☆108Updated 2 months ago
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆206Updated 3 weeks ago
- A benchmark list for evaluation of large language models.☆119Updated last month
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆177Updated last month
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆136Updated 6 months ago
- Must-read Papers on Large Language Model (LLM) Planning.☆417Updated 10 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"☆159Updated last week
- ALFWorld: Aligning Text and Embodied Environments for Interactive Learning☆466Updated 4 months ago
- [NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents☆351Updated 8 months ago
- (ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training☆272Updated last year
- ☆135Updated 5 months ago
- Paper collection on building and evaluating language model agents via executable language grounding☆355Updated last year
- Data and Code for Program of Thoughts (TMLR 2023)☆274Updated last year
- Code for the paper 🌳 Tree Search for Language Model Agents☆199Updated 10 months ago
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 2023 (oral)☆262Updated last year
- ☆276Updated 4 months ago
- [NeurIPS 2024] Agent Planning with World Knowledge Model☆136Updated 5 months ago
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks☆309Updated 7 months ago
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate…☆276Updated 3 months ago
- Towards Large Multimodal Models as Visual Foundation Agents☆216Updated last month
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)☆223Updated this week
- ☆173Updated 2 months ago