SCAI-JHU / AutoToM
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind
☆22 Updated last month
Alternatives and similar repositories for AutoToM
Users who are interested in AutoToM are comparing it to the repositories listed below.
- This is the implementation of LeCo ☆31 Updated 5 months ago
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data ☆38 Updated 4 months ago
- Evaluate the Quality of Critique ☆35 Updated last year
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ☆89 Updated 2 weeks ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆62 Updated 11 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆48 Updated 7 months ago
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆50 Updated 2 weeks ago
- Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs. ☆49 Updated last year
- Personality Alignment of Language Models ☆37 Updated 3 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆99 Updated last month
- ☆38 Updated last year
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆59 Updated 4 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆59 Updated 6 months ago
- ☆35 Updated 3 months ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆15 Updated 2 months ago
- ☆59 Updated 9 months ago
- [ACL 2024] The project of Symbol-LLM ☆55 Updated 11 months ago
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models ☆21 Updated 11 months ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024) ☆32 Updated 11 months ago
- PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion ☆54 Updated last year
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar… ☆132 Updated 4 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆72 Updated 3 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆60 Updated 2 years ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 4 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆65 Updated last year
- ☆19 Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆25 Updated 3 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆95 Updated 2 weeks ago
- ☆17 Updated 2 years ago