LaVi-Lab / VG-LLM
The code for paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
☆21 · Updated this week
Alternatives and similar repositories for VG-LLM
Users who are interested in VG-LLM are comparing it to the repositories listed below
- [ICLR 2025] Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ☆35 · Updated 3 months ago
- ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS ☆60 · Updated this week
- [ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow ☆22 · Updated last month
- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction ☆115 · Updated this week
- ☆21 · Updated 3 weeks ago
- ☆34 · Updated last year
- Official implementation of "E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models" ☆50 · Updated this week
- Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting ☆42 · Updated 2 months ago
- This is the project page of ShowRoom3D ☆25 · Updated last year
- Official pytorch implementation for "Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs" ☆18 · Updated last month
- ☆47 · Updated last week
- Code for "BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation", arXiv 2025. ☆62 · Updated last month
- [CVPR 2025] Official code for the paper "SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis" ☆83 · Updated 2 months ago
- ☆17 · Updated 5 months ago
- [Arxiv'24] LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding ☆31 · Updated 3 months ago
- (CVPR 2024) NViST: In the wild New View Synthesis from a Single Image with Transformers ☆41 · Updated 8 months ago
- [CVPR 2025 Oral] FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video ☆37 · Updated this week
- Amodal Depth Anything: Amodal Depth Estimation in the Wild ☆30 · Updated 4 months ago
- [CVPR 2024] 🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning ☆78 · Updated last year
- The official implementation for "Recollection from Pensieve: Novel View Synthesis via Learning from Uncalibrated Videos". ☆34 · Updated 2 weeks ago
- ☆10 · Updated 7 months ago
- Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation ☆92 · Updated 5 months ago
- Official Implementation of VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Jo… ☆18 · Updated 2 months ago
- [NeurIPS'2024] Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly ☆57 · Updated 6 months ago
- GLS: Geometry-aware 3D Language Gaussian Splatting ☆39 · Updated 3 months ago
- Code for "Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views", CVPR 2025 ☆36 · Updated 2 months ago
- [ICLR 2024] This is the official implementation of our paper "Semantic Flow: Learning Semantic Fields of Dynamic Scenes from Monocular Vi… ☆10 · Updated 8 months ago
- [CVPR 2025 Highlight] MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation ☆42 · Updated 3 weeks ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆89 · Updated 2 months ago
- [NeurIPS 2024] AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos ☆22 · Updated 6 months ago