FoundationAgents / VR-Bench
We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench shows that fine-tuned video models consistently outperform strong VLMs on long-horizon spatial planning tasks.
☆50Updated last month
Alternatives and similar repositories for VR-Bench
Users who are interested in VR-Bench are comparing it to the repositories listed below.
- [EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time☆89Updated 8 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay☆148Updated 8 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆298Updated last week
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆58Updated last week
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated last week
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆129Updated 6 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI☆250Updated 3 months ago
- ☆193Updated 3 months ago
- [NeurIPS 2025 D&B (Spotlight🌟)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario☆29Updated 4 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆71Updated 8 months ago
- ☆16Updated 3 months ago
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant☆44Updated last year
- A Self-Training Framework for Vision-Language Reasoning☆88Updated last year
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models☆277Updated 6 months ago
- Official implementation of MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems☆73Updated 7 months ago
- ☆76Updated 3 months ago
- [ICLR 2026] The official repository for the paper "AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning".☆63Updated 2 weeks ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆53Updated 4 months ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use☆29Updated 3 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 8 months ago
- ☆43Updated 5 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model☆64Updated 3 months ago
- ☆134Updated 2 weeks ago
- [NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning☆149Updated 4 months ago
- [CVPR'25] Interleaved-Modal Chain-of-Thought☆106Updated last month
- ☆352Updated 6 months ago
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet☆239Updated 2 months ago
- Official Repo for MageBench: Bridging Large Multimodal Models to Agents☆22Updated last year
- Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents☆220Updated 9 months ago
- Official Repository of LatentSeek☆76Updated 8 months ago