InternRobotics / G2VLM
G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
☆257 Updated 2 weeks ago
Alternatives and similar repositories for G2VLM
Users who are interested in G2VLM are comparing it to the repositories listed below.
- This is the repository that contains the source code for PhysGen3D. ☆240 Updated 4 months ago
- SAM 3D Objects with Multi-view Images ☆188 Updated last month
- ☆140 Updated 10 months ago
- 🌐 3D and 4D World Modeling: A Survey ☆783 Updated 2 weeks ago
- [NeurIPS 25] TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels ☆179 Updated last month
- A Unified Driving World Model for Future Generation and Perception ☆134 Updated 6 months ago
- 4DNeX: Feed-Forward 4D Generative Modeling Made Easy ☆818 Updated last month
- Official Implementation of Paper [DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation] ☆73 Updated last month
- [CVPR2024] Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion ☆136 Updated last year
- [NeurIPS'25] Official repository of Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations ☆491 Updated 2 months ago
- [CoRL 2025 Oral] One View, Many Worlds: Single-Image to 3D Object Meets Generative Domain Randomization for One-Shot 6D Pose Estimation. ☆446 Updated 5 months ago
- Official implementation of "Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers" ☆46 Updated 10 months ago
- [NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D ☆200 Updated last month
- [SIGGRAPH Conference 2024] GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis ☆157 Updated 9 months ago
- [ICLR 2026] Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation ☆378 Updated this week
- [NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video ☆265 Updated 2 months ago
- Official implementation of "ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation" ☆85 Updated last month
- OmniNWM: Omniscient Navigation World Models for Autonomous Driving ☆269 Updated 3 months ago
- Official code of Motus: A Unified Latent Action World Model ☆597 Updated 3 weeks ago
- [ICRA 2025] PUGS: Zero-shot Physical Understanding with Gaussian Splatting. ☆104 Updated 10 months ago
- [NeurIPS 2025 Spotlight] A Native Multimodal LLM for 3D Generation and Understanding ☆538 Updated 3 months ago
- RynnEC: Bringing MLLMs into Embodied World ☆383 Updated 3 months ago
- [ICCV 2025] DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation ☆184 Updated 5 months ago
- Are Video Models Ready as Zero-shot Reasoners? ☆84 Updated 2 months ago
- [AAAI 2026 🔥] Official implementation of "NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representation" ☆176 Updated 5 months ago
- [CVPR 2025, All Strong Accept] TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding ☆249 Updated 7 months ago
- [ICLR 2026] NewtonGen: Physics-Consistent and Controllable Text-to-Video Generation via Neural Newtonian Dynamics ☆120 Updated this week
- [ECCV2024] DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling ☆229 Updated last month
- [NeurIPS 2025] Official code of Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting ☆142 Updated last month
- DynamicTree: Interactive Real Tree Animation via Sparse Voxel Spectrum ☆29 Updated last month