researchmm / MM-Diffusion
[CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
☆382 · Updated 3 months ago
Related projects:
- Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023). ☆295 · Updated 4 months ago
- [ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi… ☆205 · Updated 4 months ago
- Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV … ☆263 · Updated 4 months ago
- Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023) ☆504 · Updated 4 months ago
- You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos. ☆199 · Updated 3 months ago
- [ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD). ☆155 · Updated last year
- [CVPR 2024] Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners ☆113 · Updated 2 months ago
- Official Pytorch Implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models ☆186 · Updated last year
- [ICCV 2023] Official PyTorch implementation for the paper "FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model" ☆260 · Updated 11 months ago
- [CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models ☆196 · Updated last week
- Official pytorch implementation of the paper: "An Edit Friendly DDPM Noise Space: Inversion and Manipulations". CVPR 2024. ☆254 · Updated 2 months ago
- ☆432 · Updated 2 years ago
- [CVPR2023] A faster, smaller, and better text-to-image model for large-scale training ☆225 · Updated 8 months ago
- Official implementation of MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation (https://arxiv.org/abs/… ☆322 · Updated last year
- Implementation of MagViT2 Tokenizer in Pytorch ☆537 · Updated last month
- Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dyna… ☆146 · Updated last year
- A reading list of video generation ☆362 · Updated this week
- Unofficial PyTorch implementation of the VideoLDM. ☆144 · Updated last year
- This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation ☆394 · Updated last week
- The pytorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models" ☆440 · Updated 3 months ago
- [CVPR 2024] | LAMP: Learn a Motion Pattern for Few-Shot Based Video Generation ☆252 · Updated 4 months ago
- LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation ☆440 · Updated 10 months ago
- [ICLR2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code" ☆232 · Updated 6 months ago
- 🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio). ☆293 · Updated 3 weeks ago
- The official implementation of DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis ☆147 · Updated 2 months ago
- Official code of SmartEdit [CVPR-2024 Highlight] ☆227 · Updated 3 months ago
- (CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision ☆107 · Updated 2 months ago
- [CVPR2024 Highlight] VBench - We Evaluate Video Generation ☆490 · Updated 2 weeks ago
- [ICCV 2023] Online Clustered Codebook ☆133 · Updated 9 months ago
- Comparison between Frechet Video Distance implementation from StyleGAN-V and the original paper ☆77 · Updated last year