mukobi / welfare-diplomacy
General-Sum variant of the game Diplomacy for evaluating AIs.
☆29 Updated last year
Alternatives and similar repositories for welfare-diplomacy
Users who are interested in welfare-diplomacy are comparing it to the libraries listed below.
- ☆138 Updated last month
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum. ☆288 Updated last month
- Governance of the Commons Simulation (GovSim) ☆57 Updated 7 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆89 Updated last year
- Interpreting how transformers simulate agents performing RL tasks ☆87 Updated last year
- ☆99 Updated last year
- An extensible benchmark for evaluating large language models on planning ☆400 Updated 2 months ago
- ☆210 Updated 2 years ago
- Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs). ☆237 Updated 2 weeks ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆119 Updated 5 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆116 Updated last year
- Code for our NeurIPS'24 Dataset and Benchmark paper: Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiatio… ☆37 Updated 9 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆124 Updated last year
- Reasoning with Language Model is Planning with World Model ☆170 Updated 2 years ago
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆241 Updated 2 weeks ago
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆140 Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆189 Updated 4 months ago
- This repository contains an LLM benchmark for the social deduction game "Resistance Avalon" ☆125 Updated 3 months ago
- Hypothetical Minds is an autonomous LLM-based agent for diverse multi-agent settings, integrating a Theory of Mind module Theory of Mind … ☆35 Updated last year
- ☆143 Updated last year
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆95 Updated last year
- ControlArena is a collection of settings, model organisms and protocols - for running control experiments. ☆90 Updated this week
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 Updated last year
- ☆122 Updated last year
- ☆29 Updated last year
- ☆295 Updated last year
- ☆238 Updated 11 months ago
- Formal Contracts for Multi-Agent Reinforcement Learning ☆19 Updated last year
- The Prism Alignment Project ☆79 Updated last year
- Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agen… ☆284 Updated 2 years ago