ShijieZhou-UCLA / VLM4D
[ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
☆37 Updated 2 months ago
Alternatives and similar repositories for VLM4D
Users who are interested in VLM4D are comparing it to the repositories listed below.
- [NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alig… ☆156 Updated 4 months ago
- Official implementation of Video-DPM ☆134 Updated last week
- Public code for XFactor: Introduces the first geometry-free model to achieve true self-supervised / pose-free Novel View Synthesis (NVS) … ☆89 Updated 3 months ago
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", ICCV 2025. ☆101 Updated 3 months ago
- Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer” ☆76 Updated last month
- [Arxiv'24] LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding ☆40 Updated 5 months ago
- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams ☆73 Updated 7 months ago
- [NeurIPS 24] The implementation and dataset of LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and… ☆60 Updated 10 months ago
- Official implementation of ICCV 2025 paper "EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds". ☆45 Updated 7 months ago
- [NeurIPS 2025] Streaming 3D Reconstruction with Explicit Spatial Pointer Memory ☆177 Updated 4 months ago
- ☆39 Updated 10 months ago
- ☆123 Updated 7 months ago
- UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation ☆135 Updated 7 months ago
- [CVPR 2024 Highlight] GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding ☆26 Updated last year
- UniUGG: Unified 3D Understanding and Generation via Geometric-Semantic Encoding ☆58 Updated 5 months ago
- Official implementation of GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs. ☆69 Updated 6 months ago
- Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image ☆63 Updated last month
- Official implementation of EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting ☆54 Updated 7 months ago
- "VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames" ☆91 Updated 6 months ago
- Official PyTorch implementation for "Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs" ☆36 Updated 6 months ago
- [NeurIPS 2025] InternScenes: A Large-scale Interactive Indoor Scene Dataset with Realistic Layouts. ☆219 Updated 3 months ago
- [ICCV 2025] ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting ☆100 Updated 2 months ago
- Official Implementation of paper "St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World" ☆106 Updated 4 months ago
- Official implementation of ICCV25 paper "Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing" ☆31 Updated 4 months ago
- [NeurIPS 2025] PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation ☆109 Updated last week
- [NeurIPS 2025] "DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling" ☆92 Updated last month
- OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ☆417 Updated 3 weeks ago
- Official code for paper: "RayRoPE: Projective Ray Positional Encoding for Multi-view Attention" ☆34 Updated last week
- ☆67 Updated last year
- "E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training" official implementation. ☆237 Updated last month