SCAI-JHU / AutoToM
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind
☆16Updated last month
Alternatives and similar repositories for AutoToM:
Users who are interested in AutoToM are comparing it to the libraries listed below.
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery☆78Updated last week
- ☆17Updated last year
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar…☆127Updated 2 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)☆62Updated 11 months ago
- ☆30Updated 4 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆133Updated 4 months ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP2023 Findings.☆25Updated last year
- PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion☆53Updated last year
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆84Updated 3 weeks ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI☆99Updated last month
- ☆62Updated 2 weeks ago
- Official Implementation of the Baby-AIGS system☆23Updated 4 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning☆90Updated last week
- ☆31Updated last year
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆39Updated last week
- ☆19Updated last year
- ☆33Updated last month
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆54Updated 10 months ago
- List of papers on Self-Correction of LLMs.☆72Updated 3 months ago
- ☆14Updated last week
- ☆78Updated this week
- Tasks for describing differences between text distributions.☆16Updated 8 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 5 months ago
- ☆11Updated 5 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆58Updated 4 months ago
- Toy implementation of Strawberry☆31Updated 6 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark.☆16Updated last week
- Evaluation on Logical Reasoning and Abstract Reasoning Challenges☆26Updated last year
- Official implementation of the paper "Process Reward Model with Q-value Rankings"☆54Updated 2 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 2 months ago