InternRobotics/OST-Bench

[NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

☆79

Alternatives and similar repositories for OST-Bench

Users who are interested in OST-Bench are comparing it to the repositories listed below.

InternRobotics / MMSI-Video-Bench
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
☆60 · Mar 11, 2026 · Updated 4 months ago
InternRobotics / InternSR
InternRobotics' open-source toolbox for vision-based embodied spatial intelligence.
☆49 · Sep 18, 2025 · Updated 10 months ago
InternRobotics / MMSI-Bench
[ICLR 2026] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
☆103 · Apr 28, 2026 · Updated 2 months ago
InternRobotics / StreamVLN
[ICRA 2026] Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
☆554 · Nov 2, 2025 · Updated 8 months ago
KaiyueSun98 / T2I-Personalization-with-AR
☆47 · Apr 20, 2025 · Updated last year
TencentARC / GRPO-CARE
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
☆83 · Jun 23, 2025 · Updated last year
Karine-Huang / GenMAC
[AAAI 2026] GenMAC for Compositional Text-to-Video Generation
☆35 · Jan 10, 2026 · Updated 6 months ago
InternRobotics / G2VLM
[CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆345 · Apr 18, 2026 · Updated 3 months ago
KaiyueSun98 / T2I-ReasonBench
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
☆37 · Sep 16, 2025 · Updated 10 months ago
InternRobotics / InternManip
An All-in-one robot manipulation learning suite for policy models training and evaluation on various datasets and benchmarks.
☆175 · Oct 15, 2025 · Updated 9 months ago
InternLM / OVO-S-Bench
An official implementation of "OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs"
☆47 · Jun 24, 2026 · Updated 3 weeks ago
song2yu / SIBench-VSR
This is a project on visual spatial reasoning tasks-SIBench
☆27 · Jan 12, 2026 · Updated 6 months ago
InternRobotics / CronusVLA
[AAAI26 oral] CronusVLA: Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling
☆109 · Jan 11, 2026 · Updated 6 months ago
InternRobotics / EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
☆672 · Jun 13, 2025 · Updated last year
InternRobotics / VL-LN
VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs
☆57 · Jan 5, 2026 · Updated 6 months ago
HKU-MMLab / OmniX
Official implementation of "OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes".
☆100 · Mar 31, 2026 · Updated 3 months ago
sg-3d / sg3d
☆55 · Oct 3, 2024 · Updated last year
ZCMax / LLaVA-3D
[ICCV 2025] A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
☆384 · Oct 21, 2025 · Updated 8 months ago
OmniMMI / OmniMMI
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆23 · Updated this week
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
facebookresearch / Multi-SpatialMLLM
[CVPR 2026] Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models
☆178 · Feb 25, 2026 · Updated 4 months ago
ShijieZhou-UCLA / VLM4D
[ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
☆55 · Nov 20, 2025 · Updated 8 months ago
InternRobotics / Grounded_3D-LLM
Code&Data for Grounded 3D-LLM with Referent Tokens
☆136 · Jan 5, 2025 · Updated last year
LaVi-Lab / VG-LLM
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆245 · Nov 28, 2025 · Updated 7 months ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆70 · Jul 22, 2025 · Updated 11 months ago
VITA-Group / VLM-3R
[CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
☆428 · Updated this week
InternRobotics / InternScenes
[NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts.
☆256 · Jul 7, 2026 · Updated last week
YuqingWang1029 / CubiD
[CVPR2026 Highlight] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens https://arxiv.org/abs…
☆63 · Apr 10, 2026 · Updated 3 months ago
ESI-Bench / ESI-Bench
☆116 · Updated this week
InternRobotics / OV_PARTS
[NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation
☆95 · Jun 24, 2024 · Updated 2 years ago
THU-SI / Spatial-MLLM
[NeurIPS 2025 Spotlight] Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆479 · Feb 5, 2026 · Updated 5 months ago
TencentARC / SEED-Bench-R1
☆100 · Jun 23, 2025 · Updated last year
OpenSenseNova / SenseNova-SI
[CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models
☆289 · May 14, 2026 · Updated 2 months ago
Yangr116 / VST
[ECCV2026] Visual Spatial Tuning
☆198 · Mar 25, 2026 · Updated 3 months ago
dengandong / GroundMoRe
☆18 · May 18, 2026 · Updated 2 months ago
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98 · Jul 27, 2025 · Updated 11 months ago
qiulu66 / Anime-Shooter
☆55 · Jun 4, 2025 · Updated last year
yangsizhe / MoVie
[NeurIPS 2023] MoVie: Visual Model-Based Policy Adaptation for View Generalization
☆12 · Sep 22, 2023 · Updated 2 years ago
mlvlab / ST-VLM
☆13 · Mar 28, 2025 · Updated last year