seervideodiffusion / SeerVideoLDM
[ICLR 2024] Seer: Language Instructed Video Prediction with Latent Diffusion Models
☆30 Updated 10 months ago
Alternatives and similar repositories for SeerVideoLDM:
Users who are interested in SeerVideoLDM are comparing it to the libraries listed below.
- [ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation … ☆37 Updated last month
- ☆68 Updated last month
- [CVPR 2024] On the Content Bias in Fréchet Video Distance ☆105 Updated 6 months ago
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆88 Updated 2 weeks ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (arXiv 2025) ☆24 Updated last week
- Code release for NeurIPS 2023 paper SlotDiffusion: Object-centric Learning with Diffusion Models ☆84 Updated last year
- Frequency Autoregressive Image Generation with Continuous Tokens ☆42 Updated 3 weeks ago
- Official code for MotionBench (CVPR 2025) ☆31 Updated 3 weeks ago
- ☆33 Updated last week
- ☆17 Updated 5 months ago
- Comparison between Frechet Video Distance implementation from StyleGAN-V and the original paper ☆102 Updated 2 months ago
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆49 Updated 8 months ago
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆66 Updated last month
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆58 Updated last month
- Code for paper "Grounding Video Models to Actions through Goal Conditioned Exploration". ☆44 Updated 3 months ago
- ElasticTok: Adaptive Tokenization for Image and Video ☆64 Updated 4 months ago
- [ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper ☆147 Updated 10 months ago
- IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks ☆58 Updated 6 months ago
- official code repo of CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation ☆17 Updated 2 weeks ago
- ☆30 Updated last year
- ☆21 Updated 10 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆63 Updated last month
- [NeurIPS 2024] Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation ☆61 Updated 5 months ago
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23. ☆75 Updated last year
- [CVPR 2025] Open implementation of "RandAR" ☆69 Updated last week
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆98 Updated 4 months ago
- ☆122 Updated 2 months ago
- “FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs a simplest scale design and is compatible with an… ☆97 Updated 3 months ago
- ICCV2023-Diffusion-Papers ☆109 Updated last year
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number] ☆38 Updated 2 months ago