InternRobotics / G2VLM
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆226 Updated 3 weeks ago
Alternatives and similar repositories for G2VLM
Users who are interested in G2VLM are comparing it to the repositories listed below.
- [NeurIPS 25] TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels ☆106 Updated this week
- Source code for PhysGen3D. ☆233 Updated 3 months ago
- SAM 3D Objects with Multi-view Images ☆130 Updated 2 weeks ago
- ☆140 Updated 8 months ago
- 4DNeX: Feed-Forward 4D Generative Modeling Made Easy ☆801 Updated this week
- Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views ☆107 Updated last week
- [CVPR2024] Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion ☆135 Updated last year
- A Unified Driving World Model for Future Generation and Perception ☆127 Updated 4 months ago
- RynnEC: Bringing MLLMs into Embodied World ☆382 Updated last month
- [NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video ☆250 Updated 3 weeks ago
- [NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D ☆193 Updated last week
- 🌐 3D and 4D World Modeling: A Survey ☆695 Updated this week
- [NeurIPS'25] Official repository of Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations ☆450 Updated 3 weeks ago
- [ICRA 2025] PUGS: Zero-shot Physical Understanding with Gaussian Splatting. ☆102 Updated 8 months ago
- OmniNWM: Omniscient Navigation World Models for Autonomous Driving ☆263 Updated last month
- [NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding ☆515 Updated 2 months ago
- [SIGGRAPH Conference 2024] GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis ☆155 Updated 8 months ago
- Are Video Models Ready as Zero-shot Reasoners? ☆84 Updated 3 weeks ago
- 🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World ☆133 Updated this week
- [AAAI 2026 🔥] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation" ☆174 Updated 4 months ago
- [CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation. ☆431 Updated 4 months ago
- Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation ☆370 Updated 3 weeks ago
- [CVPR 2025, All Strong Accept] TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding ☆243 Updated 6 months ago
- Official implementation of "Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers" ☆46 Updated 9 months ago
- Official Implementation of Paper [DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation] ☆66 Updated last week
- [ECCV2024] DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling ☆228 Updated last week
- [NeurIPS 2025] Official code of Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting ☆133 Updated 2 weeks ago
- The official PyTorch implementation of Diffusion Time-step Curriculum for One Image to 3D Generation (CVPR 2024) ☆74 Updated last year
- Uncommon Objects in 3D dataset ☆1,308 Updated last month
- [ICCV2025 Highlight] Stereo Any Video: Temporally Consistent Stereo Matching ☆377 Updated 2 weeks ago