FoundationAgents/VR-Bench

We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench shows that fine-tuned video models consistently outperform strong VLMs on long-horizon spatial planning tasks.

☆66

Alternatives and similar repositories for VR-Bench

Users who are interested in VR-Bench are comparing it to the repositories listed below.

FoundationAgents / AutoEnv
Scaling Agentic Environments Automatically.
☆66 · Mar 26, 2026 · Updated 3 months ago
FoundationAgents / foundation-protocol
A Python runtime for multi-entity AI collaboration — agents, humans, and tools on a shared protocol layer.
☆50 · Jun 18, 2026 · Updated last month
XiangJinyu / APrompt
An automatic prompt iteration and optimization generator suitable for any scenario
☆16 · Jan 31, 2025 · Updated last year
Video-Reason / Awesome-Video-Reasoning
This is a collection of recent papers on reasoning in video generation models.
☆165 · Updated this week
FoundationAgents / InteractComp
☆22 · Jan 26, 2026 · Updated 5 months ago
thuml / MiniVeo3-Reasoner
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give…
☆230 · Apr 13, 2026 · Updated 3 months ago
diaoquesang / GL-LCM
[MICCAI 2025] GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images
☆17 · Mar 12, 2026 · Updated 4 months ago
FoundationAgents / SPO
Self Supervised Prompt Optimization.
☆75 · Oct 19, 2025 · Updated 9 months ago
BRZ911 / Wrong-of-Thought
[EMNLP 2024 Findings] Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information
☆13 · Oct 1, 2024 · Updated last year
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
shiqichen17 / SPA
GitHub repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"
☆36 · Nov 1, 2025 · Updated 8 months ago
InternLM / EndoCoT
[ECCV 2026] An official implementation of "EndoCoT". Scaling endogenous Chain-of-Thought (CoT) reasoning in diffusion models for complex …
☆43 · Jun 26, 2026 · Updated 3 weeks ago
Ceaglex / LoVA
The code and weight for LoVA. LoVA is a novel model for Long-form Video-to-Audio generation. Based on the Diffusion Transformer (DiT) arc…
☆16 · Feb 27, 2025 · Updated last year
thuml / RLVR-World
Official repository for "RLVR-World: Training World Models with Reinforcement Learning" (NeurIPS 2025), https://arxiv.org/abs/2505.13934
☆269 · Oct 28, 2025 · Updated 8 months ago
yangluo7 / V-ReasonBench
☆36 · Feb 18, 2026 · Updated 5 months ago
QuanyiLi / gwm-wiser
☆17 · Jun 23, 2026 · Updated last month
LJH-coding / EDELINE
[NeurIPS 2025 Spotlight] EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling
☆19 · Oct 18, 2025 · Updated 9 months ago
FoundationAgents / ReCode
Next paradigm for LLM Agent. Unify plan and action through recursive code generation for adaptive, human-like decision-making.
☆561 · Apr 21, 2026 · Updated 3 months ago
cheolhong0916 / contrastive-probing
☆15 · Jun 19, 2026 · Updated last month
Evanwu1125 / LiteCoT
☆17 · Jun 10, 2025 · Updated last year
Tencent / SelfEvolvingAgent
Research works from Tencent AI Lab regarding self-evolving agents
☆97 · Jan 30, 2026 · Updated 5 months ago
VainF / In-Video-Instructions
[Arxiv 2025] In-Video Instructions: Visual Signals as Generative Control
☆45 · Nov 25, 2025 · Updated 7 months ago
tongjingqi / Thinking-with-Video
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S…
☆314 · Jun 21, 2026 · Updated last month
LJungang / Awesome-Video-Reasoning-Landscape
🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.
☆189 · Jun 14, 2026 · Updated last month
Yui010206 / Adaptive-Visual-Imagination-Control
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
☆18 · Jun 2, 2026 · Updated last month
zlab-princeton / vero
Vero: An Open RL Recipe for General Visual Reasoning
☆134 · Jun 19, 2026 · Updated last month
YuCrazing / ClothTransformer
Code for paper "ClothTransformer: Unified Latent-Space Transformers for Scalable Cloth Simulation"
☆23 · Updated this week
LYFCLOUDFAN / mask-world-model
Code for "Predicting What Matters: Robust Generalist Robot Policy Learning via Future Semantic Mask".
☆32 · Jun 8, 2026 · Updated last month
linYDTHU / StableVelocity
[ICML 2026] Stable Velocity: A Variance Perspective on Flow Matching
☆29 · Feb 19, 2026 · Updated 5 months ago
diaoquesang / Code-in-Paper-Guide
🌟 A step-by-step guide to inserting code links in your papers
☆25 · Aug 2, 2025 · Updated 11 months ago
ZhenyangLiu / ReasonGrounder
☆15 · Jul 11, 2025 · Updated last year
YiCheng98 / IntegrativeDecoding
Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"
☆33 · Apr 12, 2025 · Updated last year
visgym / VisGym
Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
☆114 · May 3, 2026 · Updated 2 months ago
vis-nlp / OpenCQA
☆13 · Jun 20, 2023 · Updated 3 years ago
HL-hanlin / V-Co
Official implementation of V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising (ECCV 2026)
☆27 · Jun 29, 2026 · Updated 3 weeks ago
dujh22 / AiMed
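AiMed is an AI large language model for Chinese-language medicine. It aims to handle tasks such as medical knowledge question answering, medical paper reading, and medical literature retrieval, and to support applications in medical research.
☆13 · Feb 8, 2025 · Updated last year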
injadlu / VCR
☆13 · Feb 25, 2025 · Updated last year
ZJU-REAL / VerifyBench
[ICLR 2026] VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
☆21 · Feb 18, 2026 · Updated 5 months ago
Wakals / CoVT
[ECCV 2026] Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
☆376 · Apr 17, 2026 · Updated 3 months ago