marcelbinz / CENTaUR
☆25Updated last year
Alternatives and similar repositories for CENTaUR
Users who are interested in CENTaUR are comparing it to the repositories listed below.
- ☆44Updated last week
- ☆18Updated 10 months ago
- ☆32Updated this week
- Hypothetical Minds is an autonomous LLM-based agent for diverse multi-agent settings, integrating a Theory of Mind module Theory of Mind …☆30Updated 10 months ago
- ☆69Updated last year
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs.☆117Updated last year
- Governance of the Commons Simulation (GovSim)☆48Updated 4 months ago
- ☆93Updated 11 months ago
- Language of thought library for Python 3☆49Updated last year
- Evaluating the Moral Beliefs Encoded in LLMs☆26Updated 5 months ago
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar…☆131Updated 3 months ago
- maze datasets for investigating OOD behavior of ML systems☆46Updated last week
- ☆55Updated 6 months ago
- MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research☆21Updated 2 months ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL☆51Updated 7 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"☆55Updated last year
- ☆21Updated last year
- Interpreting how transformers simulate agents performing RL tasks☆82Updated last year
- DialOp: Decision-oriented dialogue environments for collaborative language agents☆106Updated 6 months ago
- General-Sum variant of the game Diplomacy for evaluating AIs.☆29Updated last year
- Code and data for People construct simplified mental representations to plan☆23Updated last year
- ☆128Updated last year
- Models of Sequential Decision-Making☆49Updated 4 months ago
- ☆37Updated 8 months ago
- Super fast implementations of common benchmark text world games☆47Updated 2 months ago
- Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).☆234Updated 7 months ago
- An OpenAI Gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic mult…☆68Updated 2 years ago
- A benchmark for evaluating learning agents based on just language feedback☆79Updated 2 months ago
- Repository for the paper Stream of Search: Learning to Search in Language☆147Updated 4 months ago
- Datasets from the paper "Towards Understanding Sycophancy in Language Models"☆76Updated last year