hou-yz / dvgformer
Code for our paper: Learning Camera Movement Control from Real-World Drone Videos
☆29 Updated 2 months ago
Alternatives and similar repositories for dvgformer
Users interested in dvgformer are comparing it to the repositories listed below:
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆21 Updated 2 months ago
- Self-reimplemented version of 4D-LRM. ☆30 Updated 3 weeks ago
- Official implementation of EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance ☆34 Updated 3 weeks ago
- open-sourced video dataset with dynamic scenes and camera movements annotation ☆61 Updated 2 months ago
- [3DV 2025] Learning Naturally Aggregated Appearance for Efficient 3D Editing ☆34 Updated 4 months ago
- Repo for "Human-Centric Foundation Models: Perception, Generation and Agentic Modeling" (https://arxiv.org/abs/2502.08556) ☆49 Updated 4 months ago
- Sora Generates Videos with Stunning Geometrical Consistency ☆50 Updated last year
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024 ☆21 Updated 6 months ago
- [ICLR 2025] Trajectory Attention For Fine-grained Video Motion Control ☆81 Updated last month
- Unofficial Implementation of "Stable Video Diffusion Multi-View" ☆79 Updated last year
- ☆38 Updated 8 months ago
- VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆20 Updated 3 months ago
- HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video ☆67 Updated last year
- ☆16 Updated last year
- TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer ☆47 Updated last year
- "Comp4D: Compositional 4D Scene Generation", Dejia Xu*, Hanwen Liang*, Neel P. Bhatt, Hezhen Hu, Hanxue Liang, Konstantinos N. Platanioti… ☆79 Updated 10 months ago
- ☆48 Updated last month
- [CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text ☆51 Updated 3 months ago
- ☆39 Updated last year
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations ☆48 Updated this week
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆70 Updated 4 months ago
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆22 Updated last year
- [ICLR' 25] AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation ☆63 Updated 3 months ago
- [ICCV 2025] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory ☆63 Updated this week
- [Neurips 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆41 Updated 2 months ago
- [ECCV 2024] HiFi-123: Towards High-fidelity One Image to 3D Content Generation ☆66 Updated 11 months ago
- Official PyTorch implementation of "A Unified Approach for Text- and Image-guided 4D Scene Generation", [CVPR 2024] ☆83 Updated last year
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆108 Updated 7 months ago
- [ICLR 2025] Where Am I and What Will I See : An Auto-Regressive Model for Spatial Localization and View Prediction ☆36 Updated 4 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆70 Updated last week