mindspore-lab / minddiffusion
A collection of diffusion models based on MindSpore
☆159Updated last year
Alternatives and similar repositories for minddiffusion:
Users who are interested in minddiffusion are comparing it to the libraries listed below
- one for all, Optimal generator with No Exception☆406Updated this week
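- DeepSpeed tutorials, annotated examples, and study notes (efficient training of large models)☆155Updated last year
- TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset)☆180Updated last year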
- My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"☆225Updated 2 months ago
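- Fine-tune Stable Diffusion with DreamBooth, LoRA, and ControlNet☆55Updated last year
- PyTorch model training code for single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across methods☆94Updated last year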
- The official implementation of "Relay Diffusion: Unifying diffusion process across resolutions for image synthesis" [ICLR 2024 Spotlight]☆296Updated 11 months ago
- ☆159Updated this week
- MindFace is an open source toolkit based on MindSpore, containing the most advanced face recognition and detection models, such as ArcFa…☆46Updated last month
- A collection of multimodal (MM) + Chat resources☆249Updated last month
- Models and examples built with OneFlow☆97Updated 5 months ago
- A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".☆989Updated 2 years ago
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content☆573Updated 5 months ago
- ☆103Updated last year
- diffusion-based layout-to-image generation model☆290Updated last year
- [MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models☆286Updated last month
- Keras implementation of generative diffusion models☆276Updated last month
- VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks☆385Updated 8 months ago
- ☆108Updated last year
- Lossless Training Speed Up by Unbiased Dynamic Data Pruning☆331Updated 6 months ago
- ☆67Updated last year
- Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models☆347Updated 2 weeks ago
- The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision M…☆496Updated last year
- Research Code for Multimodal-Cognition Team in Ant Group☆139Updated 8 months ago
- [ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi…☆232Updated 10 months ago
- ☆45Updated last year
- MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer☆220Updated 11 months ago
- PyTorch implementation of RCG https://arxiv.org/abs/2312.03701☆909Updated 6 months ago
- A collection of awesome text-to-image generation studies.☆568Updated 3 weeks ago
- Text-To-Image Generation with Chinese Characters☆129Updated last year