InternRobotics/MMSI-Bench

[ICLR 2026] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

☆103

Alternatives and similar repositories for MMSI-Bench

Users who are interested in MMSI-Bench are comparing it to the libraries listed below.

InternRobotics / MMSI-Video-Bench
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
☆60 · Mar 11, 2026 · Updated 4 months ago
InternRobotics / InternSR
InternRobotics' open-source toolbox for vision-based embodied spatial intelligence.
☆49 · Sep 18, 2025 · Updated 10 months ago
InternRobotics / OST-Bench
[NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
☆79 · Sep 29, 2025 · Updated 9 months ago
mll-lab-nu / MindCube
☆163 · Mar 23, 2026 · Updated 3 months ago
InternRobotics / CronusVLA
[AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling
☆109 · Jan 11, 2026 · Updated 6 months ago
facebookresearch / Multi-SpatialMLLM
[CVPR 2026] Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
☆178 · Feb 25, 2026 · Updated 4 months ago
gca-spatial-reasoning / gca
Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning"
☆89 · Apr 7, 2026 · Updated 3 months ago
InternRobotics / InternManip
An all-in-one robot manipulation learning suite for policy model training and evaluation on various datasets and benchmarks.
☆175 · Oct 15, 2025 · Updated 9 months ago
EvolvingLMMs-Lab / EASI
Holistic Evaluation of Multimodal LLMs on Spatial Intelligence
☆117 · Jul 1, 2026 · Updated 2 weeks ago
yangsizhe / MoVie
[NeurIPS 2023] MoVie: Visual Model-Based Policy Adaptation for View Generalization
☆12 · Sep 22, 2023 · Updated 2 years ago
qizekun / OmniSpatial
[ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
☆88 · Jan 21, 2026 · Updated 5 months ago
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
Chenyu-Wang567 / All-Angles-Bench
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
☆69 · Mar 22, 2026 · Updated 3 months ago
UMass-Embodied-AGI / MindJourney
[NeurIPS 2025] Source code for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"
☆151 · Nov 4, 2025 · Updated 8 months ago
OpenSenseNova / SenseNova-SI
[CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models
☆289 · May 14, 2026 · Updated 2 months ago
InternRobotics / PPI
[RSS 2025] Gripper Keypose and Object Pointflow as Interfaces for Bimanual Robotic Manipulation
☆79 · Jul 22, 2025 · Updated 11 months ago
THU-SI / Spatial-MLLM
[NeurIPS 2025 Spotlight] Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆479 · Feb 5, 2026 · Updated 5 months ago
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98 · Jul 27, 2025 · Updated 11 months ago
KAIST-Visual-AI-Group / APC-VLM
[ICCV 2025] Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
☆66 · Sep 12, 2025 · Updated 10 months ago
InternRobotics / VLM-Grounder
[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
☆134 · May 22, 2025 · Updated last year
ByteDance-Seed / SpatialTree
CVPR 2026 (Highlight); Spatial Intelligence; MLLMs
☆47 · Feb 24, 2026 · Updated 4 months ago
InternRobotics / InstructVLA
[ICLR 2026] InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
☆116 · Jan 27, 2026 · Updated 5 months ago
Yangr116 / VST
[ECCV2026] Visual Spatial Tuning
☆198 · Mar 25, 2026 · Updated 3 months ago
InternRobotics / InternScenes
[NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.
☆256 · Jul 7, 2026 · Updated last week
KAIST-Visual-AI-Group / Token-Warping-MLLM
☆22 · Mar 31, 2026 · Updated 3 months ago
InternRobotics / EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
☆672 · Jun 13, 2025 · Updated last year
InternRobotics / StreamVLN
[ICRA 2026] Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
☆554 · Nov 2, 2025 · Updated 8 months ago
mll-lab-nu / Awesome-Spatial-Intelligence-in-VLM
A paper list for spatial reasoning
☆766 · Jan 19, 2026 · Updated 6 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆111 · Jul 9, 2025 · Updated last year
InternRobotics / G2VLM
[CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆345 · Apr 18, 2026 · Updated 3 months ago
InternRobotics / InternHumanoid
A versatile, all-in-one toolbox for whole-body humanoid robot control.
☆186 · Oct 10, 2025 · Updated 9 months ago
InternRobotics / Seer
[ICLR 2025 Oral] Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
☆311 · Jul 8, 2025 · Updated last year
MINT-SJTU / STI-Bench
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
☆39 · Jan 12, 2026 · Updated 6 months ago
LaVi-Lab / VG-LLM
The code for the paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆245 · Nov 28, 2025 · Updated 7 months ago
InternRobotics / OV_PARTS
[NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation
☆95 · Jun 24, 2024 · Updated 2 years ago
VITA-Group / VLM-3R
[CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
☆428 · Updated this week
zhangquanchen / 3DThinker
[CVPR 2026] Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
☆243 · May 7, 2026 · Updated 2 months ago
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
mulplue / MGF
[NeurIPS'24] PyTorch Implementation of "MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction".
☆24 · Apr 5, 2025 · Updated last year