microsoft / tale-suite
Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.
☆25 · Updated last week
Alternatives and similar repositories for tale-suite
Users who are interested in tale-suite are comparing it to the repositories listed below.
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆47 · Updated 2 years ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆102 · Updated 2 years ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 · Updated last year
- Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024 ☆143 · Updated 11 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆124 · Updated last year
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆132 · Updated last year
- ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum. ☆336 · Updated 2 months ago
- Self-Alignment with Principle-Following Reward Models ☆169 · Updated 4 months ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆154 · Updated 5 months ago
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆228 · Updated 2 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆185 · Updated 8 months ago
- Can Language Models Solve Olympiad Programming? ☆123 · Updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆72 · Updated 11 months ago
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆19 · Updated 11 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆124 · Updated last year
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆58 · Updated last year
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆205 · Updated last year
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆147 · Updated last year
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆116 · Updated 2 years ago
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆278 · Updated 2 weeks ago
- This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models ☆57 · Updated 2 years ago
- DialOp: Decision-oriented dialogue environments for collaborative language agents ☆111 · Updated last year