We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench shows that fine-tuned video models consistently outperform strong VLMs on long-horizon spatial planning tasks.
☆53 · Feb 4, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for VR-Bench
Users who are interested in VR-Bench are comparing it to the libraries listed below.
- When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought ☆27 · Feb 14, 2026 · Updated 2 weeks ago
- [MICCAI 2025] GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images ☆15 · Oct 2, 2025 · Updated 5 months ago
- More reliable Video Understanding Evaluation ☆14 · Sep 23, 2025 · Updated 5 months ago
- The first open-domain closed-loop revisited benchmark for evaluating memory consistency and action control in world models. ☆41 · Feb 10, 2026 · Updated 3 weeks ago
- Measuring RAG solutions throughput and latency ☆19 · Jul 23, 2024 · Updated last year
- ☆50 · Dec 11, 2025 · Updated 2 months ago
- Scaling Agentic Environments Automatically. ☆51 · Jan 22, 2026 · Updated last month
- Multi-step AI agents powered by Gemini 2.0 and the LangGraph framework. These agents orchestrate complex workflows and enhance their reas… ☆10 · Dec 19, 2024 · Updated last year
- [ICRA 2026] StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes ☆20 · Feb 17, 2026 · Updated 2 weeks ago
- MiniMax-Provider-Verifier offers a rigorous, vendor-agnostic way to verify whether third-party deployments of the Minimax M2 model are co… ☆28 · Feb 18, 2026 · Updated last week
- Code2Worlds: Empowering Coding LLMs for 4D World Generation ☆79 · Updated this week
- Software to enable data-rich collaboration from high-resolution display walls to your laptop ☆16 · Feb 19, 2026 · Updated last week
- AI-native knowledge kernel for human/agent collaboration. Use it as a Knowledge Base, Wiki, Annotator, Research Tool, or Agentic Memory. ☆29 · Updated this week
- Martingale posterior neural networks for fast sequential decision making @ Neurips 2025 ☆23 · Nov 13, 2025 · Updated 3 months ago
- ☆92 · Dec 30, 2025 · Updated 2 months ago
- ☆213 · Dec 19, 2025 · Updated 2 months ago
- Tusk Drift Demo - Node.js Service ☆58 · Jan 20, 2026 · Updated last month
- Community maintained hardware plugin for vLLM on AWS Neuron ☆23 · Updated this week
- Auction Theory Toolbox – Computer Verified Auctions ☆14 · Jul 12, 2016 · Updated 9 years ago
- Directed masked autoencoders ☆14 · Feb 20, 2026 · Updated last week
- A lightweight and highly extensible Agent framework ☆20 · Jan 20, 2026 · Updated last month
- DragMesh: Interactive 3D Generation Made Easy ☆20 · Dec 28, 2025 · Updated 2 months ago
- Fast, free, easy, and object-agnostic video anonymization ☆11 · Dec 12, 2020 · Updated 5 years ago
- MCP server for Grok AI API integration ☆21 · Jun 2, 2025 · Updated 9 months ago
- ☆17 · Aug 5, 2025 · Updated 6 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated 2 months ago
- SkillX.sh — The Only Skill That Your AI Agent Needs. AI agent skills marketplace with semantic search, leaderboard, ratings, and CLI. ☆24 · Feb 13, 2026 · Updated 2 weeks ago
- Benchmark evaluating ocean forecasting systems against reference datasets and observations. ☆26 · Updated this week
- A multi-agent framework to help with your homework. ☆10 · Mar 1, 2025 · Updated last year
- [MICCAI 2025] Bridging the Gap in Missing Modalities: Leveraging Knowledge Distillation and Style Matching for Brain Tumor Segmentation ☆18 · Jul 13, 2025 · Updated 7 months ago
- ☆13 · Oct 21, 2024 · Updated last year
- ☆31 · Feb 3, 2026 · Updated 3 weeks ago
- A powerful integration that combines Browserbase's Stagehand with Mastra for advanced web automation, scraping, and AI-powered web intera… ☆33 · Feb 4, 2026 · Updated 3 weeks ago
- ☆24 · Dec 19, 2025 · Updated 2 months ago
- OLD Codebase for Intelligent Systems 2020 and Project AI, Vrije Universiteit Amsterdam ☆12 · Jan 10, 2023 · Updated 3 years ago
- The open-source language model computer ☆10 · Mar 22, 2024 · Updated last year
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 · Jun 22, 2025 · Updated 8 months ago
- ☆20 · Updated this week
- ☆24 · Oct 3, 2025 · Updated 4 months ago