ShijieZhou-UCLA / VLM4D
[ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
☆34Updated last month
Alternatives and similar repositories for VLM4D
Users that are interested in VLM4D are comparing it to the libraries listed below
- Public code for XFactor: Introduces the first geometry-free model to achieve true self-supervised / pose-free Novel View Synthesis (NVS) …☆84Updated 2 months ago
- [NeurIPS 2025] Streaming 3D Reconstruction with Explicit Spatial Pointer Memory☆174Updated 3 months ago
- [NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alig…☆154Updated 3 months ago
- UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation☆134Updated 7 months ago
- ☆121Updated 6 months ago
- [Arxiv'24] LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding☆39Updated 4 months ago
- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams☆69Updated 7 months ago
- The official implementation of InfiniteVGGT☆116Updated this week
- ☆67Updated last year
- "VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames"☆89Updated 5 months ago
- [ICCV 2025] ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting☆95Updated last month
- [ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction☆42Updated 5 months ago
- ☆15Updated last year
- ☆39Updated 9 months ago
- [ICCV 2025] This is the official implementation of POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction☆116Updated 5 months ago
- Official Implementation of paper "St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World"☆100Updated 3 months ago
- ☆21Updated last year
- Localized Gaussian Point Management☆80Updated 6 months ago
- Code for Faster VGGT with Block-Sparse Global Attention☆88Updated last month
- Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image☆57Updated 2 weeks ago
- Official implementation of paper "G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior"☆71Updated 2 months ago
- [NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling"☆90Updated 3 weeks ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”☆70Updated last month
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding☆58Updated 4 months ago
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025.☆100Updated 3 months ago
- Official implementation of ICCV25 paper "Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing"☆25Updated 4 months ago
- [ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow☆24Updated 9 months ago
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation☆30Updated 6 months ago
- [NeurIPS 2025] ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS☆157Updated last month
- Official implementation of EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting☆51Updated 6 months ago