Shark0-0 / VG4D
Implementation of the paper: VG4D: Vision-Language Model Goes 4D Video Recognition (ICRA 2024)
☆15 Updated last year
Alternatives and similar repositories for VG4D
Users who are interested in VG4D are comparing it to the repositories listed below.
- This is the project page of ShowRoom3D ☆25 Updated last year
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆61 Updated 10 months ago
- Official implementation of PARIS3D (Accepted to ECCV 2024). ☆26 Updated 10 months ago
- [CVPR 2025] 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs ☆45 Updated last year
- Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models ☆142 Updated 2 months ago
- Self-reimplemented version of 4D-LRM. ☆48 Updated 2 months ago
- ObjCtrl-2.5D ☆48 Updated 4 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆111 Updated 4 months ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆97 Updated 4 months ago
- Code implementation for: From Virtual Games to Real-World Play ☆37 Updated last month
- LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS ☆91 Updated 3 weeks ago
- The official repository of "Sekai: A Video Dataset towards World Exploration" ☆120 Updated 3 weeks ago
- Improving 3D Large Language Model via Robust Instruction Tuning ☆61 Updated 5 months ago
- [ICLR 2025] Dataset and Code for Paper "Learning to Generate Diverse Pedestrian Movements from Web Videos with Noisy Labels" ☆41 Updated last month
- Official implementation of PartSTAD: 2D-to-3D Part Segmentation Task Adaptation (ECCV 2024). ☆45 Updated 9 months ago
- ☆30 Updated 3 months ago
- Official repository for "Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation" (ICLR2025) ☆71 Updated 4 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks (CVPR 2025) ☆41 Updated 3 months ago
- [CVPR 2025] Open-World Amodal Appearance Completion ☆31 Updated 3 weeks ago
- The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs" ☆55 Updated 2 months ago
- Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning" ☆70 Updated 2 weeks ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆39 Updated 8 months ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆22 Updated last year
- [ICCV2023] "Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts" by Wenyan Cong, Hanxue Li… ☆48 Updated last year
- ☆21 Updated 8 months ago
- Sora Generates Videos with Stunning Geometrical Consistency ☆52 Updated last year
- [ECCV 2024] Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects. ☆80 Updated last year
- SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding ☆54 Updated last month
- "Comp4D: Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Platanioti… ☆79 Updated 11 months ago
- [CVPR 2024 Highlight] Code release for "The More You See in 2D, the More You Perceive in 3D" ☆63 Updated 9 months ago