FoundationAgents / VR-Bench
We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench shows that fine-tuned video models consistently outperform strong VLMs on long-horizon spatial planning tasks.
☆46 Updated this week
Alternatives and similar repositories for VR-Bench
Users who are interested in VR-Bench are comparing it to the repositories listed below.
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆216 Updated 2 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆238 Updated 3 weeks ago
- [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality ☆57 Updated 5 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆97 Updated 2 weeks ago
- Official code for the paper: DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models ☆21 Updated 5 months ago
- ☆346 Updated 4 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆294 Updated last month
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆86 Updated 10 months ago
- ☆135 Updated 9 months ago
- 🔥🔥🔥Latest Papers, Codes on Uncertainty-based RL ☆55 Updated 3 months ago
- The paper list of "Memory in the Age of AI Agents: A Survey" ☆243 Updated this week
- Official Repository of "Learning what reinforcement learning can't" ☆70 Updated last month
- Survey on Data-centric Large Language Models ☆88 Updated last year
- Official Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning" ☆57 Updated last month
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated last week
- ☆121 Updated 3 weeks ago
- A Sober Look at Language Model Reasoning ☆89 Updated last month
- ☆89 Updated last week
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet ☆223 Updated last month
- repo for paper https://arxiv.org/abs/2504.13837 ☆288 Updated 5 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆86 Updated 5 months ago
- Towards a Unified View of Large Language Model Post-Training ☆192 Updated 3 months ago
- ☆70 Updated last year
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆398 Updated 5 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆373 Updated 3 weeks ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆80 Updated last month
- Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It. ☆19 Updated 8 months ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions. ☆52 Updated 4 months ago
- ☆173 Updated last week
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆62 Updated last month