ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024.
☆64 · Jun 24, 2024 · Updated last year
Alternatives and similar repositories for ToMBench
Users who are interested in ToMBench are comparing it to the repositories listed below.
- ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind (AAAI2025) ☆19 · Apr 16, 2025 · Updated 10 months ago
- ☆22 · Nov 8, 2023 · Updated 2 years ago
- [ACL24] EmoBench: Evaluating the Emotional Intelligence of Large Language Models ☆109 · May 16, 2025 · Updated 9 months ago
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆59 · May 31, 2024 · Updated last year
- Code and data for the paper: On the Reliability of Psychological Scales on Large Language Models ☆30 · Dec 15, 2025 · Updated 2 months ago
- MuMA-ToM: Multi-modal Multi-Agent Theory of Mind ☆38 · Jan 23, 2025 · Updated last year
- The Implementation of "Machine Theory of Mind", ICML 2018 ☆27 · Mar 14, 2022 · Updated 3 years ago
- ☆13 · Jul 17, 2024 · Updated last year
- Testing Theory of Mind (ToM) in language models with epistemic logic ☆22 · Dec 13, 2023 · Updated 2 years ago
- [ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large … ☆24 · May 29, 2024 · Updated last year
- ☆15 · Aug 13, 2020 · Updated 5 years ago
- ☆23 · Oct 31, 2018 · Updated 7 years ago
- Official code for ICML 2024 paper on Persona In-Context Learning (PICLe) ☆26 · Jun 27, 2024 · Updated last year
- ☆28 · Nov 22, 2019 · Updated 6 years ago
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆281 · Jan 23, 2026 · Updated last month
- ☆48 · Aug 12, 2025 · Updated 6 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆81 · May 7, 2024 · Updated last year
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor ☆29 · Jan 13, 2026 · Updated last month
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆96 · Aug 20, 2024 · Updated last year
- ☆49 · Apr 4, 2025 · Updated 10 months ago
- Twitter-NFT sales bot that tweets individual and sweep sales with images from Opensea, Looksrare, X2Y2, and Blur using Opensea/Looksrare … ☆13 · Jul 27, 2023 · Updated 2 years ago
- COMMS Software for UPSat ☆12 · Dec 17, 2018 · Updated 7 years ago
- ☆26 · Updated this week
- This is the official implementation for MA-LoT. ☆19 · Aug 4, 2025 · Updated 6 months ago
- test images with not appropriate labels in MNIST dataset ☆10 · Mar 3, 2018 · Updated 7 years ago
- the datasets of our paper ☆11 · Feb 26, 2024 · Updated 2 years ago
- ☆16 · Jun 25, 2025 · Updated 8 months ago
- ☆12 · Feb 22, 2021 · Updated 5 years ago
- Machine learning for molecules workshop 2022 ☆13 · Nov 30, 2022 · Updated 3 years ago
- [AAAI22] CEM: Commonsense-aware Empathetic Response Generation ☆95 · May 16, 2025 · Updated 9 months ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆41 · Sep 24, 2024 · Updated last year
- ☆39 · Aug 9, 2022 · Updated 3 years ago
- ☆42 · Nov 21, 2023 · Updated 2 years ago
- This is a repository for sharing papers in the field of empathetic conversational AI. The related source code for each paper is linked if… ☆268 · Apr 17, 2024 · Updated last year
- A collection of works that investigate social agents, simulations and their real-world impact in text, embodied, and robotics contexts. ☆109 · Jun 3, 2024 · Updated last year
- ☆12 · Apr 2, 2025 · Updated 10 months ago
- The repo for using the model https://huggingface.co/thu-coai/Attacker-v0.1 ☆13 · Apr 23, 2025 · Updated 10 months ago
- Benchmarking Deepseek R1 API response speeds across different providers for performance comparison. ☆10 · Feb 15, 2025 · Updated last year
- ☆10 · Mar 19, 2024 · Updated last year