maxin-cn / Latte
The official implementation of Latte: Latent Diffusion Transformer for Video Generation.
☆35 Updated 11 months ago
Alternatives and similar repositories for Latte
Users who are interested in Latte are comparing it to the repositories listed below.
- [AAAI 2025] Official pytorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion … ☆162 Updated last year
- Code for our ICCV 2025 paper "Adaptive Caching for Faster Video Generation with Diffusion Transformers" ☆165 Updated last year
- ☆114 Updated last year
- The HD-VG-130M Dataset ☆120 Updated last year
- [NeurIPS 2024] VideoTetris: Towards Compositional Text-To-Video Generation ☆231 Updated last year
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ☆148 Updated 11 months ago
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆313 Updated last year
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models ☆173 Updated last year
- Code repository for T2V-Turbo and T2V-Turbo-v2 ☆310 Updated last year
- [NeurIPS 2025] Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up". ☆213 Updated 4 months ago
- ☆109 Updated last year
- ☆180 Updated 2 months ago
- Open source implementation and models of One-step Diffusion with Distribution Matching Distillation ☆180 Updated last year
- [NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models ☆119 Updated last year
- [ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation ☆203 Updated 11 months ago
- ☆68 Updated last year
- [CVPR 2025] Official PyTorch implementation of StoryGPT-V ☆40 Updated 7 months ago
- A light-weight and high-efficient training framework for accelerating diffusion tasks. ☆51 Updated last year
- ☆213 Updated 11 months ago
- Video-Infinity generates long videos quickly using multiple GPUs without extra training. ☆191 Updated last year
- Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think! ☆120 Updated 10 months ago
- ☆360 Updated last year
- Repo for Qwen Image Finetune ☆195 Updated last month
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions ☆132 Updated last year
- UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing ☆116 Updated 9 months ago
- An Efficient Text-to-Image Generation Pretrain Pipeline ☆129 Updated 9 months ago
- [IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model ☆106 Updated 10 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆85 Updated last year
- MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training) ☆146 Updated last year
- [ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image… ☆341 Updated 3 weeks ago