neu-vi/struct2d

Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)

☆31

Alternatives and similar repositories for struct2d

Users who are interested in struct2d are comparing it to the repositories listed below.

neu-vi / FleVRS
FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024
☆22 · Dec 9, 2024 · Updated last year
neu-vi / Diag-HOI
☆27 · Aug 17, 2023 · Updated 2 years ago
zoezheng126 / Spatio-Temporal-LLM
☆19 · Aug 7, 2025 · Updated 11 months ago
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated last month
UCSB-AI / via-video
☆25 · May 12, 2026 · Updated 2 months ago
MINT-SJTU / STI-Bench
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
☆39 · Jan 12, 2026 · Updated 6 months ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆70 · Jul 22, 2025 · Updated 11 months ago
Chenyu-Wang567 / All-Angles-Bench
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
☆69 · Mar 22, 2026 · Updated 3 months ago
wufeim / SpatialReasonerDataGen
Synthetic VQA data generation code for SpatialReasoner.
☆20 · Nov 25, 2025 · Updated 7 months ago
beacon-3d / Beacon3D
[CVPR 2025] Beacon3D: Object-centric Evaluation for 3D Grounding-QA
☆28 · Nov 25, 2025 · Updated 7 months ago
mll-lab-nu / MindCube
☆163 · Mar 23, 2026 · Updated 3 months ago
wangsen99 / LMEE
(CVPR 26) Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
☆35 · Mar 8, 2026 · Updated 4 months ago
liziwennba / SURPRISE3D
☆22 · Apr 14, 2026 · Updated 3 months ago
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
Zhanpeng1202 / pySpatial
[ICLR 26] pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning
☆28 · Jun 2, 2026 · Updated last month
hany01rye / tiger
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
☆23 · Nov 18, 2025 · Updated 8 months ago
yashbhalgat / Contrastive-Lift
[NeurIPS 2023 Spotlight] Code for "Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion"
☆73 · Nov 3, 2023 · Updated 2 years ago
Sy-Zhang / MMC-PCFG
Video-aided Unsupervised Grammar Induction, NAACL'21 [best long paper]
☆40 · Oct 27, 2022 · Updated 3 years ago
gbliao / SPC-GS
[CVPR25] SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs
☆20 · Aug 27, 2025 · Updated 10 months ago
qizekun / OmniSpatial
[ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
☆88 · Jan 21, 2026 · Updated 6 months ago
CASAGPT / CASA-GPT
PyTorch implementation of the paper: CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design [CVPR 2025]
☆15 · Apr 5, 2025 · Updated last year
HanchenTai / OV-SAM3D
Open-Vocabulary SAM3D: Understand Any 3D Scene
☆44 · Jun 9, 2025 · Updated last year
yqi19 / BEAR-official
The official repository of BEAR: Benchmarking and Enhancing Multimodal Language Models with Atomic Embodied Capabilities
☆25 · Updated this week
UCSB-AI / EditRoom
[ICLR 2025] EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
☆27 · Apr 1, 2025 · Updated last year
nianticlabs / placeit3d
[ICCV 2025] PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes
☆63 · Oct 3, 2025 · Updated 9 months ago
PeiwenSun2000 / SpaceVista
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
☆43 · May 26, 2026 · Updated last month
Show-han / Zeroshot_REC
Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)
☆28 · Jun 21, 2024 · Updated 2 years ago
zvict / papr
[NeurIPS 2023] Implementation of "PAPR: Proximity Attention Point Rendering"
☆40 · Jul 8, 2026 · Updated last week
ai4ce / MSG
[NeurIPS2024] Multiview Scene Graph (topologically representing a scene from unposed images by interconnected place and object nodes)
☆130 · Sep 26, 2025 · Updated 9 months ago
NVlabs / RelViT
[ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
☆62 · Sep 10, 2022 · Updated 3 years ago
UMass-Embodied-AGI / MindJourney
[NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"
☆151 · Nov 4, 2025 · Updated 8 months ago
OuyangKun10 / SpaceR
SpaceR: The first MLLM empowered by SG-RLVR for video spatial reasoning
☆111 · Jul 9, 2025 · Updated last year
HCPLab-SYSU / EXPRESS-Bench
Embodied Question Answering (EQA) benchmark and method (ICCV 2025)
☆60 · Aug 12, 2025 · Updated 11 months ago
neu-vi / SNAP
Interactive Point Cloud Segmentation (3DV 2026 Best Paper Candidate)
☆70 · May 9, 2026 · Updated 2 months ago
ZJU-REAL / ViewSpatial-Bench
[ECCV 2026] ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
☆82 · Mar 9, 2026 · Updated 4 months ago
fudan-zvg / UniUGG
UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding. Accepted to ICLR 2026.
☆63 · Updated this week
LaVi-Lab / VG-LLM
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆245 · Nov 28, 2025 · Updated 7 months ago
amap-cvlab / AstraNav-Memory
[Official] AstraNav-Memory: Contexts Compression for Long Memory. An image-centric memory framework for lifelong embodied navigation via …
☆81 · Jan 21, 2026 · Updated 6 months ago
lifuguan / LangSurf
[Arxiv'24] LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding
☆44 · Aug 18, 2025 · Updated 11 months ago