S-Abdelnabi / LLM-Deliberation

Code for our NeurIPS'24 Dataset and Benchmark paper: Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation

☆34

Alternatives and similar repositories for LLM-Deliberation

Users who are interested in LLM-Deliberation are comparing it to the libraries listed below.

ryoungj / ToolEmu
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
☆151 Updated last year
ucl-dark / llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
☆112 Updated last year
zjunlp / MachineSoM
[ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
☆118 Updated last month
jonathanmli / Avalon-LLM
This repository contains an LLM benchmark for the social deduction game "Resistance Avalon"
☆118 Updated last month
abdulhaim / LMRL-Gym
☆98 Updated last year
aypan17 / machiavelli
☆137 Updated 8 months ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆146 Updated 8 months ago
haotiansun14 / AdaPlanner
AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback
☆110 Updated 3 months ago
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆101 Updated last month
giorgiopiatti / GovSim
Governance of the Commons Simulation (GovSim)
☆55 Updated 5 months ago
HannahKirk / prism-alignment
The Prism Alignment Project
☆79 Updated last year
Ber666 / RAP
Reasoning with Language Model is Planning with World Model
☆168 Updated last year
rxlqn / awesome-llm-self-reflection
augmented LLM with self reflection
☆129 Updated last year
likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆25 Updated last year
allenai / ScienceWorld
ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
☆275 Updated this week
veronica320 / Faithful-COT
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
☆161 Updated last year
jianggy / MPI
This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models
☆52 Updated last year
jlin816 / dialop
DialOp: Decision-oriented dialogue environments for collaborative language agents
☆109 Updated 8 months ago
DeLLMa / DeLLMa
Official Implementation of "DeLLMa: Decision Making Under Uncertainty with Large Language Models"
☆57 Updated 8 months ago
meg-tong / sycophancy-eval
datasets from the paper "Towards Understanding Sycophancy in Language Models"
☆82 Updated last year
Berkeley-NLP / Agent-Eval-Refine
Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]
☆138 Updated 7 months ago
architsharma97 / dpo-rlaif
☆99 Updated last year
sotopia-lab / sotopia
Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight)
☆229 Updated 3 weeks ago
jiangjiechen / auction-arena
Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…
☆45 Updated last year
zjunlp / WKM
[NeurIPS 2024] Agent Planning with World Knowledge Model
☆142 Updated 7 months ago
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆49 Updated 8 months ago
HowieHwong / MetaTool
[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
☆89 Updated last year
LoryPack / LLM-LieDetector
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆71 Updated last year
microsoft / SmartPlay
SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. …
☆140 Updated last year