RERV / VDT
[ICLR 2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.
☆223 Updated 8 months ago
Alternatives and similar repositories for VDT:
Users who are interested in VDT are comparing it to the repositories listed below:
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models ☆150 Updated 3 months ago
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models ☆257 Updated last month
- 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆222 Updated 2 weeks ago
- MoVQGAN - model for the image encoding and reconstruction ☆212 Updated last year
- Scaling Diffusion Transformers with Mixture of Experts ☆242 Updated 4 months ago
- You can easily calculate FVD, PSNR, SSIM, LPIPS for evaluating the quality of generated or predicted videos. ☆274 Updated last week
- Unofficial PyTorch implementation of the VideoLDM. ☆151 Updated last year
- NOVA: Autoregressive Video Generation without Vector Quantization ☆314 Updated this week
- [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers ☆557 Updated 2 months ago
- ☆221 Updated 6 months ago
- [ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model ☆393 Updated 2 months ago
- Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023). ☆314 Updated 8 months ago
- Implementation of MagViT2 Tokenizer in Pytorch ☆588 Updated this week
- Implements VAR+CLIP for text-to-image (T2I) generation ☆112 Updated 2 weeks ago
- Comparison between Frechet Video Distance implementation from StyleGAN-V and the original paper ☆93 Updated last week
- Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023) ☆545 Updated 8 months ago
- [CVPR 2024] | LAMP: Learn a Motion Pattern for Few-Shot Based Video Generation ☆273 Updated 8 months ago
- [CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models ☆227 Updated last month
- This repo contains the code for 1D tokenizer and generator ☆645 Updated this week
- The official implementation of "Relay Diffusion: Unifying diffusion process across resolutions for image synthesis" [ICLR 2024 Spotlight] ☆282 Updated 8 months ago
- [NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation ☆227 Updated this week
- [NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers" ☆176 Updated 3 months ago
- Scalable Diffusion Models with State Space Backbone ☆149 Updated 10 months ago
- [CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model" ☆185 Updated 9 months ago
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆550 Updated 3 months ago
- Official pytorch implementation of the paper: "An Edit Friendly DDPM Noise Space: Inversion and Manipulations". CVPR 2024. ☆305 Updated 6 months ago
- Official code of SmartEdit [CVPR-2024 Highlight] ☆276 Updated 6 months ago
- Code for "Diffusion Model Alignment Using Direct Preference Optimization" ☆308 Updated this week
- Code for Fast Training of Diffusion Models with Masked Transformers ☆385 Updated 8 months ago
- ☆253 Updated 2 weeks ago