Andrewzh112 / ExpeL
☆14Updated 2 years ago
Alternatives and similar repositories for ExpeL
Users that are interested in ExpeL are comparing it to the libraries listed below
- ☆86Updated 2 years ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆147Updated last year
- Augmented LLM with self-reflection☆135Updated 2 years ago
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior…☆12Updated 2 years ago
- An Analytical Evaluation Board of Multi-turn LLM Agents [NeurIPS 2024 Oral]☆380Updated last year
- Code for the paper 🌳 Tree Search for Language Model Agents☆217Updated last year
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks☆323Updated last year
- Reasoning with Language Model is Planning with World Model☆184Updated 2 years ago
- A set of utilities for running few-shot prompting experiments on large-language models☆126Updated 2 years ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities☆156Updated 10 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆113Updated 5 months ago
- ☆186Updated 11 months ago
- ☆122Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆114Updated 3 weeks ago
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.☆324Updated 3 weeks ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆92Updated 2 months ago
- Official code for the paper "ADaPT: As-Needed Decomposition and Planning with Language Models"☆90Updated last year
- [NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking☆268Updated last year
- This repository contains an LLM benchmark for the social deduction game "Resistance Avalon"☆132Updated 7 months ago
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆47Updated last year
- ☆144Updated last year
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆254Updated 7 months ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback☆123Updated 9 months ago
- ToolBench, an evaluation suite for LLM tool manipulation capabilities.☆168Updated last year
- ☆129Updated last year
- A benchmark list for evaluation of large language models.☆153Updated 3 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898☆234Updated last year
- Code for arXiv 2023: Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback☆208Updated 2 years ago
- DialOp: Decision-oriented dialogue environments for collaborative language agents☆111Updated last year