haoningwu3639 / SimpleSDM-3
A simple and flexible PyTorch implementation of StableDiffusion-3 based on diffusers for DIY and finetuning.
☆18 Updated 2 months ago
Alternatives and similar repositories for SimpleSDM-3:
Users who are interested in SimpleSDM-3 compare it to the repositories listed below.
- A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers. ☆17 Updated last year
- A simple and flexible PyTorch implementation of StableDiffusion based on diffusers. ☆22 Updated 6 months ago
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆92 Updated 3 weeks ago
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆46 Updated this week
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆107 Updated 4 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆54 Updated last week
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆32 Updated last week
- A collection of diffusion models based on FLUX/DiT for image/video generation, editing, reconstruction, inpainting, etc. ☆35 Updated this week
- This is the official implementation for ControlVAR. ☆101 Updated 3 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆62 Updated last week
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models ☆68 Updated 10 months ago
- A simple and flexible PyTorch implementation of StableDiffusion-XL based on diffusers. ☆16 Updated 6 months ago
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆83 Updated 8 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆296 Updated 3 weeks ago
- FQGAN: Factorized Visual Tokenization and Generation ☆45 Updated 2 months ago
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook ☆87 Updated 9 months ago
- [ICLR2025] ClassDiffusion: Official impl. of Paper "ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance" ☆40 Updated 2 weeks ago
- [CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models ☆246 Updated 3 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation ☆119 Updated 2 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆131 Updated 2 months ago
- Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆85 Updated last month
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions" ☆53 Updated 5 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆68 Updated 6 months ago
- Awesome autoregressive vision foundation models ☆25 Updated 3 months ago
- [ICLR2025] ☆140 Updated 2 months ago
- Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation ☆104 Updated 2 weeks ago