cicl-stanford/procedural-evals-tom

☆40

Alternatives and similar repositories for procedural-evals-tom

Users interested in procedural-evals-tom are comparing it to the repositories listed below.

skywalker023 / fantom
👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions"
☆62 · May 31, 2024 · Updated 2 years ago
seacowx / OpenToM
The official repository of the OpenToM dataset
☆33 · Feb 2, 2025 · Updated last year
nttmdlab-nlp / ToMATO
ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind (AAAI2025)
☆20 · Apr 16, 2025 · Updated last year
shawnsihyunlee / simulatedtom
Public repository for "Think Twice: Perspective-Taking Improves Large Language Models’ Theory-of-Mind Capabilities".
☆25 · Aug 16, 2023 · Updated 2 years ago
ying-hui-he / Hi-ToM_dataset
☆21 · Oct 11, 2025 · Updated 9 months ago
salavi / Clever_Hans_or_N-ToM
☆12 · May 6, 2024 · Updated 2 years ago
facebookresearch / ToMi
Code accompanying our EMNLP 2019 paper: "Revisiting the Evaluation of Theory of Mind through Question Answering"
☆29 · Aug 9, 2020 · Updated 5 years ago
songys / 2021Langcon
☆11 · Oct 3, 2021 · Updated 4 years ago
zhchen18 / ToMBench
ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024.
☆68 · Jun 24, 2024 · Updated 2 years ago
julianje / Bishop
Mental state inference from observable behavior
☆15 · Dec 3, 2021 · Updated 4 years ago
Mars-tin / awesome-theory-of-mind
Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar…
☆155 · Jun 11, 2026 · Updated last month
CHATS-lab / KokoMind
KokoMind: Can LLMs Understand Social Interactions?
☆104 · Oct 3, 2023 · Updated 2 years ago
MicroSTM / AGENT-synthesis
Data synthesis code for "AGENT: A Benchmark for Core Psychological Reasoning"
☆24 · Mar 3, 2022 · Updated 4 years ago
yrf1 / LLM-MassiveMulticultureNormsKnowledge-NCLB
☆20 · Mar 12, 2025 · Updated last year
naver-ai / cs-shortcut
Saving Dense Retriever from Shortcut Dependency in Conversational Search (EMNLP 2022)
☆18 · Nov 24, 2022 · Updated 3 years ago
sileod / llm-theory-of-mind
Testing Theory of Mind (ToM) in language models with epistemic logic
☆22 · Jul 3, 2026 · Updated 3 weeks ago
msclar / symbolictom
☆23 · Nov 8, 2023 · Updated 2 years ago
sled-group / MindCraft
Official code for our EMNLP2021 Outstanding Paper MindCraft: Theory of Mind Modeling for Situated Dialogue in Collaborative Tasks
☆21 · May 18, 2023 · Updated 3 years ago
kayburns / tom-qa-dataset
☆24 · Oct 31, 2018 · Updated 7 years ago
kanishkg / boxing-gym
☆12 · Jul 30, 2025 · Updated 11 months ago
Dahoas / QDSyntheticData
☆14 · Aug 15, 2024 · Updated last year
McGill-NLP / latent-translation
Code for the paper "Modelling Latent Translations for Cross-Lingual Transfer"
☆17 · Nov 22, 2021 · Updated 4 years ago
noiseQA / NoiseQA
☆12 · Feb 22, 2021 · Updated 5 years ago
sotopia-lab / sotopia-pi
Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)
☆85 · May 7, 2024 · Updated 2 years ago
yookyungkho / DSBA_CS224N_2021
"CS224n 2021 winter" study - KoreaUniv. DSBA Lab
☆15 · Apr 18, 2022 · Updated 4 years ago
neurospin / pypreclin
☆13 · Feb 3, 2020 · Updated 6 years ago
jianggy / MPI
This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models
☆59 · Dec 7, 2023 · Updated 2 years ago
bhargaviparanjape / explainable_qa
Implementation for https://arxiv.org/abs/2005.00652
☆27 · Dec 8, 2022 · Updated 3 years ago
asgordon / EtcAbductionPy
An implementation of Etcetera Abduction in Python
☆11 · Jul 15, 2026 · Updated 2 weeks ago
zengyan-97 / Transformer-DST
A Generative Dialogue State Tracking Model
☆23 · Jun 24, 2021 · Updated 5 years ago
xianweiz / Python.paper.figures
Generate publication-quality figures using Python
☆23 · Jun 5, 2016 · Updated 10 years ago
sabithsn / APPDIA-Discourse-Style-Transfer
Data and code for APPDIA: A Discourse-aware Transformer-based Style Transfer Model for Offensive Social Media Conversations (COLING 2022)…
☆13 · Sep 8, 2022 · Updated 3 years ago
anirgalgali / residual-dynamics
Code for Galgali et al, 2023
☆14 · Jan 11, 2023 · Updated 3 years ago
mojsaeed / RuleBert
☆20 · Mar 30, 2022 · Updated 4 years ago
kztakemoto / mmllm
Moral Machine Experiment on LLMs
☆11 · Jun 17, 2026 · Updated last month
HKUNLP / multilingual-transfer
Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability"
☆15 · Jun 13, 2023 · Updated 3 years ago
nlp-waseda / mtl-eadrg
Emotion-Aware Dialogue Response Generation by Multi-Task Learning
☆13 · Jan 22, 2022 · Updated 4 years ago
DSBA-Lab / CodeLab
DSBA code study
☆30 · Nov 7, 2023 · Updated 2 years ago
cultural-csk / candle
Extracting Cultural Commonsense Knowledge at Scale (WWW 2023)
☆11 · Feb 15, 2024 · Updated 2 years ago