Shark0-0 / VG4D
Implementation of the paper: VG4D: Vision-Language Model Goes 4D Video Recognition (ICRA 2024)
☆14 Updated 11 months ago
Alternatives and similar repositories for VG4D:
Users who are interested in VG4D are comparing it to the repositories listed below.
- A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. ☆19 Updated last week
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆22 Updated 10 months ago
- [NeurIPS 2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding ☆67 Updated 4 months ago
- Official implementation of PARIS3D (Accepted to ECCV 2024). ☆21 Updated 6 months ago
- Official implementation of "Reangle-A-Video: 4D Video Generation as Video-to-Video Translation" ☆33 Updated 2 weeks ago
- This is the project page of ShowRoom3D ☆25 Updated last year
- PyTorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting ☆69 Updated last month
- ☆32 Updated last week
- ☆17 Updated last week
- Open-Vocabulary SAM3D: Understand Any 3D Scene ☆27 Updated 6 months ago
- Repo for "Human-Centric Foundation Models: Perception, Generation and Agentic Modeling" (https://arxiv.org/abs/2502.08556) ☆36 Updated last month
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆37 Updated 3 months ago
- The official implementation of the paper "Exploring the Potential of Encoder-free Architectures in 3D LMMs" ☆50 Updated last month
- ☆19 Updated 11 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆105 Updated 2 weeks ago
- [ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models … ☆44 Updated 2 weeks ago
- ☆21 Updated 3 months ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆60 Updated 5 months ago
- Paper: UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting ☆14 Updated last month
- Open-world 3D part segmentation of point clouds ☆71 Updated last week
- "Comp4D: Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Platanioti… ☆77 Updated 7 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆65 Updated last month
- ☆31 Updated 9 months ago
- GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data ☆61 Updated 3 months ago
- ☆34 Updated 11 months ago
- [ICLR 2025] AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation ☆60 Updated last week
- Official implementation of “4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models” (CVPR 2025) ☆76 Updated 2 weeks ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆49 Updated 2 months ago
- DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors ☆37 Updated 6 months ago
- WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes ☆69 Updated last week