W-Ted / N3D-VLM
Official code for paper: N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
☆85 · Jan 14, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for N3D-VLM
Users who are interested in N3D-VLM are comparing it to the repositories listed below:
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025) ☆30 · Oct 28, 2025 · Updated 3 months ago
- [NeurIPS 2025] LabelAny3D: Label Any Object 3D in the Wild ☆119 · Jan 6, 2026 · Updated last month
- The official implementation of "DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation". (arXiv 2601.22153) ☆118 · Jan 30, 2026 · Updated 2 weeks ago
- Official Code of SocioMind for CVPR 2024 paper "Digital Life Project: Autonomous 3D Characters with Social Intelligence" ☆29 · Sep 9, 2024 · Updated last year
- spatial intelligence; interactive 3D scene generation; world model ☆62 · Dec 16, 2025 · Updated last month
- ☆35 · Apr 29, 2025 · Updated 9 months ago
- [3DV 2026] FastMesh: Efficient Artistic Mesh Generation via Component Decoupling ☆116 · Nov 11, 2025 · Updated 3 months ago
- A 3D mesh viewer for vscode ☆73 · Jul 4, 2025 · Updated 7 months ago
- [ICCV 2025] NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes ☆89 · Oct 26, 2025 · Updated 3 months ago
- [ICCV 2025] InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes ☆114 · Jul 24, 2025 · Updated 6 months ago
- [NeurIPS 2025] ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction ☆60 · Jan 27, 2026 · Updated 2 weeks ago
- [NeurIPS'25] HyRF: Hybrid Radiance Fields for Efficient and High-quality Novel View Synthesis ☆69 · Dec 17, 2025 · Updated last month
- Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983… ☆69 · Jan 28, 2026 · Updated 2 weeks ago
- ☆56 · Aug 5, 2025 · Updated 6 months ago
- StableWorld: Towards Stable and Consistent Long Interactive Video Generation ☆76 · Feb 3, 2026 · Updated last week
- Official code for ICCV2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis ☆34 · Dec 27, 2023 · Updated 2 years ago
- [NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning ☆72 · Dec 14, 2025 · Updated last month
- code release for HouseCrafter (ICCV 2025 Highlight) ☆64 · Oct 23, 2025 · Updated 3 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆79 · Dec 10, 2025 · Updated 2 months ago
- [CVPR 2025] GenFusion: Closing the Loop between Reconstruction and Generation via Videos ☆161 · Apr 22, 2025 · Updated 9 months ago
- [CVPR 2025 Award Candidate & Oral] TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion ☆42 · Apr 24, 2025 · Updated 9 months ago
- ☆117 · Jan 28, 2026 · Updated 2 weeks ago
- ☆17 · Aug 1, 2025 · Updated 6 months ago
- ☆14 · Jan 9, 2026 · Updated last month
- ☆37 · Mar 25, 2025 · Updated 10 months ago
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025. ☆101 · Oct 6, 2025 · Updated 4 months ago
- ☆60 · Nov 17, 2025 · Updated 2 months ago
- [ICCV 2025] Official code for AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation ☆269 · Jan 8, 2026 · Updated last month
- ☆121 · Aug 28, 2025 · Updated 5 months ago
- FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection ☆24 · Jan 13, 2026 · Updated last month
- FeatureBench: Benchmarking Agentic Coding for Complex Feature Development [ICLR 2026] ☆18 · Updated this week
- The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning" ☆81 · Oct 15, 2025 · Updated 3 months ago
- Official implementation of "Imaginarium: Vision-guided High-quality 3D Scene Layout Generation" ☆41 · Dec 30, 2025 · Updated last month
- [CVPR 2025🎉] Official implementation for paper "Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Man… ☆43 · Mar 25, 2025 · Updated 10 months ago
- 3D Editing via Propagation of Image Prompts to Multi-View ☆19 · Nov 30, 2025 · Updated 2 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated last month
- ☆17 · Aug 5, 2025 · Updated 6 months ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆106 · Mar 19, 2025 · Updated 10 months ago
- SpeedVision is an AI-powered tool that detects and calculates vehicle speed from video footage using YOLO-based object detection and fram… ☆10 · Sep 22, 2024 · Updated last year