EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large language models' ability to infer a camera wearer's goals, in-the-moment belief states, and future actions.
☆13Apr 1, 2025Updated 11 months ago
Alternatives and similar repositories for EgoToM
Users who are interested in EgoToM are comparing it to the repositories listed below
- [ICLR 26] Official Implementation of MaskInversion☆30Updated this week
- [ICCV 2025] Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction☆22Oct 27, 2025Updated 4 months ago
- Multi-modality Hierarchical Recall based on GBDTs for Bipolar Disorder Classification☆10Jul 12, 2023Updated 2 years ago
- #2019 Micro-expression Grand Challenge☆12Dec 23, 2019Updated 6 years ago
- ☆31Sep 19, 2025Updated 5 months ago
- ☆32Feb 13, 2026Updated 2 weeks ago
- ☆22Jan 12, 2026Updated last month
- Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency☆60Jun 6, 2025Updated 8 months ago
- R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning☆29Feb 9, 2026Updated 3 weeks ago
- ☆13Apr 23, 2025Updated 10 months ago
- [NeurIPS 2025] Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking☆22Oct 22, 2025Updated 4 months ago
- ☆12Jul 16, 2024Updated last year
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring☆24Aug 8, 2025Updated 6 months ago
- A reconstruction framework for materializing subjective experiences from brain signals☆13Jan 18, 2025Updated last year
- ☆10Jun 12, 2023Updated 2 years ago
- This repository contains the Adverbs in Recipes (AIR) dataset and the code published at the CVPR 23 paper: "Learning Action Changes by Me…☆13May 25, 2023Updated 2 years ago
- ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding☆17Aug 8, 2025Updated 6 months ago
- Computed Appraisals Model. Code and data for the 2023 paper, "Emotion prediction as computation over a generative theory of mind"☆13Jun 12, 2023Updated 2 years ago
- [NeurIPS'23] ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding☆12Dec 9, 2023Updated 2 years ago
- DARMA: Software for Dual Axis Rating and Media Annotation☆12Nov 28, 2022Updated 3 years ago
- ☆30Feb 15, 2026Updated 2 weeks ago
- code for the paper Imitation Learning from Observation with Automatic Discount Scheduling☆13Mar 27, 2024Updated last year
- This code is submitted to ICCV Workshop 2017: Fake vs. true facial emotion recognition competition☆11Oct 17, 2017Updated 8 years ago
- ☆24Nov 20, 2025Updated 3 months ago
- This code submission for the ICCV 17 Real Versus Fake Expressed Emotion Challenge provides source code to extract the features and classi…☆11Aug 28, 2017Updated 8 years ago
- Toolkit for TRoVE, for generating synthetic dataset from real-world annotations and scenes. Accepted at #ECCV2022☆12Jul 20, 2022Updated 3 years ago
- Detect corn stalks for micro-sensor insertion☆13Mar 5, 2024Updated last year
- Official code base for "Long-Tailed Diffusion Models With Oriented Calibration" ICLR2024☆15Jul 11, 2024Updated last year
- Official implementation of EgoThinker at NIPS 2025☆24Nov 25, 2025Updated 3 months ago
- AAAI 2024-Controllable Mind Visual Diffusion Model☆16Dec 18, 2023Updated 2 years ago
- [CVPR 2025] Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors☆16Jun 6, 2025Updated 8 months ago
- [ICCV 2023] This is for the paper "Deep Homography Mixture for Single Image Rolling Shutter Correction".☆13May 25, 2025Updated 9 months ago
- Data release for Step Differences in Instructional Video (CVPR24)☆14Jun 19, 2024Updated last year
- Simple demo showing how to use dlib face detection and alignment in MATLAB☆11Nov 22, 2018Updated 7 years ago
- Official PyTorch codebase for the Modeling Caption Diversity in Contrastive Vision-Language Pretraining paper.☆18Mar 28, 2025Updated 11 months ago
- Dataset and evaluation benchmark for Privacy Leakage Evaluation of Autonomous Web Agents☆35Feb 21, 2026Updated last week
- This is a quick patch for compiling OpenCV 2.4.x with CUDA 9.☆13Apr 17, 2018Updated 7 years ago
- Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024]☆14Jul 11, 2024Updated last year
- A Novel Apex-Time Network for Cross-Dataset Micro-Expression Recognition☆13Jan 11, 2022Updated 4 years ago