MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
☆56 · Mar 11, 2026 · Updated last week
Alternatives and similar repositories for MMSI-Video-Bench
Users who are interested in MMSI-Video-Bench are comparing it to the libraries listed below
- [NeurIPS 2025] OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding ☆73 · Sep 29, 2025 · Updated 5 months ago
- the official repo for EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimo… ☆20 · Apr 9, 2025 · Updated 11 months ago
- InternRobotics' open-source toolbox for vision-based embodied spatial intelligence. ☆48 · Sep 18, 2025 · Updated 6 months ago
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025) ☆30 · Oct 28, 2025 · Updated 4 months ago
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆47 · Updated this week
- ☆22 · Mar 7, 2025 · Updated last year
- Implementation of <Symbolic Graphics Programming with Large Language Models> ☆38 · Sep 14, 2025 · Updated 6 months ago
- [NeurIPS 2023] MoVie: Visual Model-Based Policy Adaptation for View Generalization ☆11 · Sep 22, 2023 · Updated 2 years ago
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models ☆24 · Jan 1, 2026 · Updated 2 months ago
- [ICLR 2026] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence ☆78 · Mar 13, 2026 · Updated last week
- This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding" ☆26 · Aug 24, 2023 · Updated 2 years ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ☆83 · Mar 9, 2026 · Updated last week
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models ☆16 · Sep 27, 2024 · Updated last year
- Understand what physics/algorithms do transformers learn internally when trained on planetary motion ☆39 · Feb 9, 2026 · Updated last month
- [NeurIPS 2025] Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking ☆24 · Updated this week
- ☆66 · Jan 6, 2026 · Updated 2 months ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023 ☆11 · Sep 21, 2023 · Updated 2 years ago
- Cambrian-S: Towards Spatial Supersensing in Video ☆514 · Dec 27, 2025 · Updated 2 months ago
- PaperBot: Learning to Design Real-World Tools Using Paper ☆13 · Mar 15, 2024 · Updated 2 years ago
- ☆11 · Mar 22, 2024 · Updated last year
- Software Engineering Economy | Tongji Univ. SSE Course Design ☆11 · Sep 19, 2020 · Updated 5 years ago
- This repo provides methods for building and evaluating Retrieval Augmented Generation (RAG) systems. ☆18 · Sep 25, 2024 · Updated last year
- Video Reasoning Segmentation ☆28 · Nov 29, 2024 · Updated last year
- Official code of DMA: Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding, ECCV 2024 ☆31 · Jul 18, 2024 · Updated last year
- Models and code for the ICLR 2020 workshop paper "Towards Understanding Normalization in Neural ODEs" ☆16 · Apr 27, 2020 · Updated 5 years ago
- A simple Read-It-Later and link collection tool, AI-powered for text and images, multi-platform, open-source. A browser extension availab… ☆11 · May 13, 2025 · Updated 10 months ago
- [SIGGRAPH Asia 2025] Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization ☆35 · Nov 30, 2025 · Updated 3 months ago
- ☆33 · Nov 26, 2025 · Updated 3 months ago
- Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces ☆88 · Jun 6, 2025 · Updated 9 months ago
- Official training code for MUG-V 10B video generation model. Built on Megatron-LM (v0.14.0) with production-ready distributed training fo… ☆19 · Oct 20, 2025 · Updated 5 months ago
- Blending Custom Photos with Video Diffusion Transformers ☆48 · Jan 21, 2025 · Updated last year
- The official codes of Learning to Decouple the Lights for 3D Face Texture Modeling (NeurIPS'24) ☆14 · Mar 17, 2025 · Updated last year
- ☆13 · Mar 9, 2024 · Updated 2 years ago
- collab-dev - Collaboration Metrics for Code Reviews ☆23 · May 12, 2025 · Updated 10 months ago
- ☆17 · Apr 17, 2025 · Updated 11 months ago
- ☆26 · Nov 25, 2023 · Updated 2 years ago
- This is the official PyTorch implementation for DiffHDR: Towards High-quality HDR Deghosting with Conditional Diffusion Models (TCSVT'202… ☆21 · Feb 12, 2024 · Updated 2 years ago
- ☆14 · Jul 23, 2024 · Updated last year
- ☆27 · Feb 29, 2024 · Updated 2 years ago