WeichenZh / Open3DVQA

☆22

Alternatives and similar repositories for Open3DVQA

Users who are interested in Open3DVQA are comparing it to the repositories listed below.

mll-lab-nu / MindCube
☆76Updated 3 weeks ago
MSR3D / MSR3D
[NeurIPS 2024] Official code repository for MSR3D paper
☆60Updated last month
sg-3d / sg3d
☆49Updated 9 months ago
OpenRobotLab / OST-Bench
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
☆54Updated this week
Zhoues / RoboRefer
Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"
☆98Updated last week
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for the paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆138Updated last month
ZCMax / ScanReason
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
☆77Updated 9 months ago
YoujunZhao / OpenScan
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
☆18Updated 3 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆69Updated last week
JeffWang987 / EgoVid
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
☆109Updated 8 months ago
staymylove / 3DMIT
Code for 3DMIT: 3D Multi-Modal Instruction Tuning for Scene Understanding
☆30Updated 11 months ago
ATR-DBI / ScanQA
☆126Updated last year
yyyybq / Awesome-Spatial-Reasoning
A paper list for spatial reasoning
☆121Updated last month
fudan-zvg / spar
☆51Updated last month
UMass-Embodied-AGI / MultiPLY
Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
☆130Updated 8 months ago
Haochen-Wang409 / ross3d
Official implementation of "Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness".
☆44Updated 3 weeks ago
SilongYong / SQA3D
[ICLR 2023] SQA3D for embodied scene understanding and reasoning
☆135Updated last year
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆217Updated 7 months ago
OpenRobotLab / StreamVLN
Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
☆105Updated this week
CurryYuan / ZSVG3D
[CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
☆54Updated 11 months ago
YunzeMan / Situation3D
[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning
☆39Updated 7 months ago
qizekun / OmniSpatial
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
☆48Updated 2 weeks ago
AdaCheng / EgoThink
[CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…
☆60Updated 3 months ago
Zhangwenyao1 / DreamVLA
DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge
☆71Updated this week
mengfeidu / EmbSpatial-Bench
☆18Updated last year
BIT-DYN / OpenObj
[RAL 2024] OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding
☆27Updated 5 months ago
OpenRobotLab / Grounded_3D-LLM
Code & Data for Grounded 3D-LLM with Referent Tokens
☆123Updated 6 months ago
OpenRobotLab / VLM-Grounder
[CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
☆109Updated last month
MTU3D / MTU3D
☆64Updated last week
OpenRobotLab / MMSI-Bench
[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
☆43Updated last week