W-Ted/N3D-VLM

Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

☆117

Alternatives and similar repositories for N3D-VLM

Users who are interested in N3D-VLM are comparing it to the repositories listed below.

LaVi-Lab / VG-LLM
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆248 · Nov 28, 2025 · Updated 8 months ago
W-Ted / UDC-NeRF
Official code for ICCV2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis
☆34 · Dec 27, 2023 · Updated 2 years ago
InternRobotics / G2VLM
[CVPR 2026] G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆346 · Apr 18, 2026 · Updated 3 months ago
yanchi-3dv / PG-Occ
[ICLR 2026] This is the official implementation of PG-Occ: Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocab…
☆34 · Feb 19, 2026 · Updated 5 months ago
Visual-AI / 3DRS
[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding
☆158 · Dec 9, 2025 · Updated 7 months ago
VITA-Group / VLM-3R
[CVPR 2026] VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
☆431 · Jul 15, 2026 · Updated 2 weeks ago
AIGeeksGroup / 3D-R1
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding
☆414 · Jul 20, 2026 · Updated last week
UVA-Computer-Vision-Lab / LabelAny3D
[NeurIPS 2025] LabelAny3D: Label Any Object 3D in the Wild
☆130 · Jan 6, 2026 · Updated 6 months ago
Visionary-Laboratory / holi-spatial
[ICML 2026 Oral] Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
☆372 · Updated this week
fudan-zvg / UniUGG
UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding. Accepted to ICLR 2026.
☆63 · Jul 16, 2026 · Updated last week
InternRobotics / MMSI-Video-Bench
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
☆61 · Mar 11, 2026 · Updated 4 months ago
THU-SI / Spatial-MLLM
[NeurIPS 2025 Spotlight] Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
☆480 · Feb 5, 2026 · Updated 5 months ago
LiuJF1226 / Mono4DGS-HDR
[ICLR 2026] Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos
☆29 · May 29, 2026 · Updated 2 months ago
hanxunyu / DepthVLM
🔥 Official code repository for "Unlocking Dense Metric Depth Estimation in VLMs"
☆155 · Jul 22, 2026 · Updated last week
Yangr116 / VST
[ECCV 2026] Visual Spatial Tuning
☆200 · Mar 25, 2026 · Updated 4 months ago
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆220 · Jun 4, 2025 · Updated last year
Zhoues / RoboTracer
[ECCV 2026] Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"
☆82 · Jun 18, 2026 · Updated last month
JIA-Lab-research / RePlan
[ECCV 2026] RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing
☆67 · Jul 1, 2026 · Updated 3 weeks ago
WU-CVGL / GS-Reasoner
Reasoning in Space via Grounding in the World (ICLR 2025)
☆56 · Nov 3, 2025 · Updated 8 months ago
elainew728 / motion-edit
Official Repository of paper: "MotionEdit: Benchmarking and Learning Motion-Centric Image Editing"
☆67 · Feb 28, 2026 · Updated 5 months ago
AnjieCheng / SpatialRGPT
[NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"
☆336 · Dec 14, 2024 · Updated last year
facebookresearch / DepthLM_Official
[ICLR 2026 Oral (top 1.2%)] Official implementation of DepthLM
☆363 · Jun 1, 2026 · Updated last month
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆70 · Jul 22, 2025 · Updated last year
Livioni / OmniVGGT-official
[CVPR 2026 Highlight] OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer
☆352 · May 21, 2026 · Updated 2 months ago
AnjieCheng / SR-3D
[ICLR'26] This repository is the implementation of "3D Aware Region Prompted Vision Language Model"
☆30 · Feb 19, 2026 · Updated 5 months ago
W-Ted / F3D-Gaus
Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting
☆52 · Mar 11, 2025 · Updated last year
LogosRoboticsGroup / SPAR
From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…
☆90 · Jan 5, 2026 · Updated 6 months ago
haoningwu3639 / SpatialScore
[CVPR 2026 Highlight] SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
☆84 · May 28, 2026 · Updated 2 months ago
UVA-Computer-Vision-Lab / ovmono3d
[3DV 2026] Open Vocabulary Monocular 3D Object Detection
☆98 · Apr 29, 2026 · Updated 3 months ago
facebookresearch / univlg
Unifying 2D and 3D Vision-Language Understanding
☆127 · Jul 2, 2026 · Updated 3 weeks ago
yangcaoai / VGGT-Det-CVPR2026
Official code for CVPR 2026 paper: VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection
☆145 · Jul 15, 2026 · Updated 2 weeks ago
CASAGPT / CASA-GPT
PyTorch implementation of the paper: CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design [CVPR 2025]
☆15 · Apr 5, 2025 · Updated last year
qizekun / OmniSpatial
[ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
☆89 · Jan 21, 2026 · Updated 6 months ago
TencentARC / DSR_Suite
☆74 · Apr 21, 2026 · Updated 3 months ago
THU-SI / LangScene-X
[ICCV 2025] LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion
☆302 · Jul 15, 2025 · Updated last year
MiZhenxing / One4D
[ECCV 2026] One4D: Unified 4D Generation and Reconstruction
☆116 · Jun 18, 2026 · Updated last month
zhangzaibin / spagent
SPAgent, a foundation agent for understanding, reasoning over, and operating within the physical and spatial world.
☆209 · Updated this week
microsoft / HiSpatial
[CVPR 2026] HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
☆38 · Jul 2, 2026 · Updated 3 weeks ago
chenhaomingbob / CSC
[CVPR 2024] This is official implementation of our CVPR 2024 paper "Building a Strong Pre-Training Baseline for Universal 3D Large-Scale …
☆17 · Jun 11, 2024 · Updated 2 years ago