zhchen18 / ToMBench
ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024.
☆31 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for ToMBench
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning · ☆35 · Updated last year
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning · ☆20 · Updated 8 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. · ☆47 · Updated 4 months ago
- Code and data for "Dialogue Planning via Brownian Bridge Stochastic Process for Goal-directed Proactive Dialogue" (ACL Findings 2023). · ☆21 · Updated last year
- Code and data for "Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation" (EMNLP 2023… · ☆28 · Updated 6 months ago
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering. · ☆18 · Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach · ☆30 · Updated last year
- Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings) · ☆12 · Updated 4 months ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain between cause-effect relationships. It is a QA dataset containing 9000… · ☆46 · Updated 11 months ago
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar… · ☆101 · Updated 9 months ago
- [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models · ☆42 · Updated 11 months ago
- Code and Results of the Paper Titled: Revisiting the Reliability of Psychological Scales on Large Language Models · ☆29 · Updated last month
- Codes and data for ACL 2023 Findings paper "Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning" · ☆15 · Updated 8 months ago
- [ACL 2023] Learning Multi-step Reasoning by Solving Arithmetic Tasks. https://arxiv.org/abs/2306.01707 · ☆23 · Updated last year
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". · ☆42 · Updated 2 months ago