blairstar / The_Art_of_DPM
An In-depth Analysis of Diffusion Probability Model
☆104 Updated 2 months ago
Related projects:
- Keras implementation of Finite Scalar Quantization ☆58 Updated 10 months ago
- My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution" ☆168 Updated last week
- ☆92 Updated 2 months ago
- A list for Text-to-Video, Image-to-Video works ☆167 Updated last month
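- TaiSu (太素): a large-scale Chinese multimodal dataset (a hundred-million-scale Chinese vision-language pre-training dataset) ☆172 Updated 10 months ago
- PyTorch model training code for single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods ☆69 Updated 6 months ago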
- Text-To-Image Generation with Chinese Characters ☆116 Updated last year
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models ☆39 Updated 2 months ago
- Chinese CLIP models with SOTA performance. ☆44 Updated last year
- The official implementation of Latte: Latent Diffusion Transformer for Video Generation. ☆32 Updated 6 months ago
- VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models ☆93 Updated last month
- [ICLR2024] The official implementation of paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxi… ☆205 Updated 4 months ago
- ☆63 Updated last year
- CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP ☆130 Updated 2 years ago
- Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning ☆23 Updated 10 months ago
- ☆93 Updated last year
- ☆23 Updated last year
- VideoTetris: Towards Compositional Text-To-Video Generation ☆197 Updated 2 weeks ago
- ☆16 Updated last year
- ACM MM'23 (oral), SUR-adapter for pre-trained diffusion models can acquire the powerful semantic understanding and reasoning capabilities… ☆111 Updated 4 months ago
- ☆34 Updated 3 months ago
- The official implementation of "Relay Diffusion: Unifying diffusion process across resolutions for image synthesis" [ICLR 2024 Spotlight] ☆260 Updated 4 months ago
- The HD-VG-130M Dataset ☆106 Updated 5 months ago
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora ☆36 Updated 5 months ago
- Official code for the paper "Exploring Domain Incremental Video Highlights Detection with the LiveFood Benchmark" ☆26 Updated 8 months ago
- Scaling Diffusion Transformers with Mixture of Experts ☆178 Updated last week
- Official PyTorch implementation of “Spatial-Semantic Collaborative Cropping for User Generated Content” (AAAI24) ☆33 Updated 5 months ago
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ☆107 Updated 3 months ago
- Latent-based SR using MoE and frequency augmented VAE decoder ☆145 Updated 9 months ago
- Video dataset dedicated to portrait-mode video recognition. ☆35 Updated 5 months ago