NVlabs / Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
☆4,264 Updated 2 weeks ago
Alternatives and similar repositories for Sana
Users that are interested in Sana are comparing it to the libraries listed below
- MAGI-1: Autoregressive Video Generation at Scale ☆3,284 Updated this week
- Official repository for LTX-Video ☆6,696 Updated 3 weeks ago
- ☆2,214 Updated this week
- Official implementations for paper: VACE: All-in-One Video Creation and Editing ☆2,648 Updated last month
- 📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion ☆2,170 Updated 3 months ago
- [ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,960 Updated 6 months ago
- Open-source unified multimodal model ☆4,204 Updated this week
- OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340 ☆4,144 Updated this week
- SkyReels V1: The first and most advanced open-source human-centric video foundation model ☆2,207 Updated 3 months ago
- ☆3,025 Updated 3 months ago
- HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo ☆1,507 Updated last month
- LTX-Video Support for ComfyUI ☆2,068 Updated last month
- ☆1,201 Updated 5 months ago
- HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation ☆1,072 Updated last week
- A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gem… ☆1,426 Updated this week
- The best OSS video generation models ☆3,219 Updated 5 months ago
- [NeurIPS 2024] Official code for PuLID: Pure and Lightning ID Customization via Contrastive Alignment ☆3,387 Updated last month
- A general fine-tuning kit geared toward diffusion models. ☆2,377 Updated this week
- SkyReels-V2: Infinite-length Film Generative model ☆3,090 Updated 3 weeks ago
- A minimal and universal controller for FLUX.1. ☆1,639 Updated 2 weeks ago
- Lumina-T2X is a unified framework for Text to Any Modality Generation ☆2,198 Updated 4 months ago
- Official repository of In-Context LoRA for Diffusion Transformers ☆1,913 Updated 6 months ago
- PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation ☆1,816 Updated 7 months ago
- [CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis ☆1,608 Updated last month
- 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning ☆1,118 Updated 2 months ago
- Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment ☆1,196 Updated 2 weeks ago
- ACE-Step: A Step Towards Music Generation Foundation Model ☆2,459 Updated 2 weeks ago
- FastVideo is a unified framework for accelerated video generation. ☆1,538 Updated this week
- PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ☆3,108 Updated 7 months ago
- [ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video ☆1,260 Updated 3 weeks ago