Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
☆89Jan 14, 2026Updated last month
Alternatives and similar repositories for N3D-VLM
Users who are interested in N3D-VLM are comparing it to the libraries listed below
- [ICLR 2026] RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation☆30Feb 5, 2026Updated last month
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)☆30Oct 28, 2025Updated 4 months ago
- ☆36Jan 10, 2026Updated last month
- [NeurIPS 2025] LabelAny3D: Label Any Object 3D in the Wild☆122Jan 6, 2026Updated 2 months ago
- The official implementation of "DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation". (arXiv 2601.22153)☆151Jan 30, 2026Updated last month
- Official Code of SocioMind for CVPR 2024 paper "Digital Life Project: Autonomous 3D Characters with Social Intelligence"☆29Sep 9, 2024Updated last year
- From Flatland to Space (SPAR). Accepted to NeurIPS 2025 Datasets & Benchmarks. A large-scale dataset & benchmark for 3D spatial perceptio…☆78Jan 5, 2026Updated 2 months ago
- A 3D mesh viewer for vscode☆73Jul 4, 2025Updated 8 months ago
- ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark☆79Feb 27, 2026Updated last week
- [3DV 2026] FastMesh: Efficient Artistic Mesh Generation via Component Decoupling☆116Nov 11, 2025Updated 3 months ago
- [ICCV 2025] NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes☆89Oct 26, 2025Updated 4 months ago
- spatial intelligence; interactive 3D scene generation; world model☆71Feb 23, 2026Updated last week
- ☆37Apr 29, 2025Updated 10 months ago
- [ICCV 2025] InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes☆115Jul 24, 2025Updated 7 months ago
- [NeurIPS'24] This repository is the implementation of "SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models"☆313Dec 14, 2024Updated last year
- [NeurIPS 2025] ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction☆61Jan 27, 2026Updated last month
- StableWorld: Towards Stable and Consistent Long Interactive Video Generation☆81Feb 3, 2026Updated last month
- Official code for ICCV2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis☆34Dec 27, 2023Updated 2 years ago
- Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"☆63Jan 19, 2026Updated last month
- [NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning☆74Dec 14, 2025Updated 2 months ago
- code release for HouseCrafter (ICCV 2025 Highlight)☆70Oct 23, 2025Updated 4 months ago
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…☆78Feb 13, 2026Updated 3 weeks ago
- [CVPR 2025] GenFusion: Closing the Loop between Reconstruction and Generation via Videos☆163Apr 22, 2025Updated 10 months ago
- ☆83Nov 10, 2025Updated 3 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”☆82Dec 10, 2025Updated 2 months ago
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment☆28Feb 24, 2026Updated last week
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025.☆101Oct 6, 2025Updated 5 months ago
- ☆18Aug 1, 2025Updated 7 months ago
- ☆14Updated this week
- [CVPR 2025 Award Candidate & Oral] TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion☆43Apr 24, 2025Updated 10 months ago
- Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"☆63Dec 17, 2025Updated 2 months ago
- ☆37Mar 25, 2025Updated 11 months ago
- ☆61Nov 17, 2025Updated 3 months ago
- [ICCV 2025] Official code for AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation☆272Feb 25, 2026Updated last week
- Blender plugin for TRELLIS and TRELLIS.2 (3D AIGC Model, Text-to-3D, Image-to-3D)☆55Dec 29, 2025Updated 2 months ago
- ☆125Aug 28, 2025Updated 6 months ago
- Official repo for paper "HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies"☆23Dec 12, 2025Updated 2 months ago
- 3D Editing via Propagation of Image Prompts to Multi-View☆18Nov 30, 2025Updated 3 months ago
- ☆17Aug 5, 2025Updated 7 months ago