Vchitect / VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
☆1,091 · Updated this week
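For reference, VBench is distributed as a pip package. Below is a minimal sketch of running an evaluation with it, based on the usage shown in the repo's README; the paths, device handling, and dimension names are illustrative placeholders rather than a definitive recipe.

```python
# Minimal sketch of a VBench evaluation run (usage pattern from the VBench README;
# file paths and the dimension list below are placeholders).
import torch
from vbench import VBench

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

my_vbench = VBench(
    device,
    "VBench_full_info.json",   # dimension/prompt metadata shipped with the repo
    "evaluation_results/",     # output directory for per-dimension scores
)

# Score a folder of generated videos on a subset of VBench dimensions.
my_vbench.evaluate(
    videos_path="sampled_videos/",
    name="my_model",
    dimension_list=["subject_consistency", "motion_smoothness"],
)
```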
Alternatives and similar repositories for VBench
Users interested in VBench are comparing it to the libraries listed below.
- [CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ☆1,363 · Updated 2 weeks ago
- A reading list of video generation ☆600 · Updated this week
- [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers ☆608 · Updated 8 months ago
- Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions" ☆455 · Updated 10 months ago
- Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA ☆1,588 · Updated 9 months ago
- Stable Video Diffusion Training Code and Extensions. ☆703 · Updated 11 months ago
- A collection of awesome video generation studies. ☆572 · Updated 3 weeks ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆544 · Updated last month
- Let's finetune video generation models! ☆486 · Updated 2 months ago
- Scalable and memory-optimized training of diffusion models ☆1,207 · Updated last month
- ☆360 · Updated 8 months ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆605 · Updated 3 months ago
- Multimodal Models in Real World ☆520 · Updated 4 months ago
- Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024) ☆462 · Updated 8 months ago
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆911 · Updated 2 weeks ago
- ☆433 · Updated this week
- [TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation. ☆1,848 · Updated 3 months ago
- VideoSys: An easy and efficient system for video generation ☆1,986 · Updated 4 months ago
- [IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models ☆934 · Updated 8 months ago
- Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis