JeffWang987 / EgoVid
EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation
☆59 · Updated last week
Related projects
Alternatives and complementary repositories for EgoVid
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts. ☆57 · Updated last month
- [3DV 2025] Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model ☆44 · Updated 5 months ago
- Unofficial Implementation of "Stable Video Diffusion Multi-View" ☆73 · Updated 7 months ago
- [ECCV 2024] EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion. ☆62 · Updated 5 months ago
- Sora Generates Videos with Stunning Geometrical Consistency ☆47 · Updated 7 months ago
- [NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding ☆48 · Updated 3 weeks ago
- [NeurIPS 2024] Official code repository for MSR3D paper ☆21 · Updated this week
- A collection of object-compositional modeling by implicit neural representation. ☆56 · Updated last year
- This is the project page of ShowRoom3D ☆25 · Updated 10 months ago
- Gaga: Group Any Gaussians via 3D-aware Memory Bank ☆78 · Updated last month
- [NeurIPS 2023 Spotlight] Code for "Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion" ☆61 · Updated last year
- Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ☆21 · Updated 3 weeks ago
- [CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning ☆25 · Updated 2 weeks ago
- [NeurIPS 2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding ☆64 · Updated last week
- Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries" ☆50 · Updated 3 months ago
- Scaling Properties of Diffusion Models For Perceptual Tasks ☆23 · Updated last week
- ☆38 · Updated 11 months ago
- 4D Panoptic Scene Graph Generation (NeurIPS'23 Spotlight) ☆86 · Updated 6 months ago
- Code for "Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes" ☆51 · Updated 7 months ago
- [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities ☆53 · Updated last month
- [WACV'25] Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding ☆44 · Updated 7 months ago
- Official code for ICCV 2023 paper: Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis ☆28 · Updated 10 months ago
- [NeurIPS 2023] Weakly Supervised 3D Open-vocabulary Segmentation ☆109 · Updated 10 months ago
- [CVPR 2024] 🏡Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning ☆69 · Updated 7 months ago
- ☆32 · Updated 7 months ago
- DynamicCity: Large-Scale LiDAR Generation from Dynamic Scenes ☆47 · Updated 3 weeks ago
- ConsistentNeRF Enhances Neural Radiance Fields with 3D Consistency for Sparse View Synthesis ☆72 · Updated last year
- [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding ☆67 · Updated 3 weeks ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆17 · Updated 6 months ago
- Code & Data for Grounded 3D-LLM with Referent Tokens ☆89 · Updated last month