Qrange-group / SUR-adapter
ACM MM'23 (oral). SUR-adapter lets pre-trained diffusion models acquire the semantic understanding and reasoning capabilities of large language models, building a high-quality textual semantic representation for text-to-image generation.
☆111 Updated 4 months ago
Related projects:
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆75 Updated 2 months ago
- ☆92 Updated 2 months ago
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models ☆118 Updated 2 weeks ago
- RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models ☆103 Updated 3 months ago
- [NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation ☆190 Updated 3 weeks ago
- Official code for 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching ☆122 Updated 4 months ago
- ☆89 Updated 9 months ago
- The HD-VG-130M Dataset ☆106 Updated 5 months ago
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations ☆107 Updated 3 months ago
- Official implementation of the paper "Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synth… ☆88 Updated 11 months ago
- ☆30 Updated 2 months ago
- MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance ☆84 Updated last month
- Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation ☆38 Updated 9 months ago
- Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step ☆133 Updated 2 months ago
- PyTorch implementation of "SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation" (CVPR 2024) ☆87 Updated last month
- [ECCV 2024] StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion ☆34 Updated 2 months ago
- Official code of SmartEdit [CVPR-2024 Highlight] ☆227 Updated 2 months ago
- (CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision ☆107 Updated 2 months ago
- [CVPR`2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization ☆77 Updated 5 months ago
- Official code for 'Paragraph-to-Image Generation with Information-Enriched Diffusion Model' ☆93 Updated 4 months ago
- [NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT ☆129 Updated 4 months ago
- [CVPR 2024] Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models ☆196 Updated this week
- Implementation of InstructEdit ☆66 Updated 10 months ago
- ReCo: Region-Controlled Text-to-Image Generation, CVPR 2023 ☆114 Updated 10 months ago
- GenEval: An object-focused framework for evaluating text-to-image alignment ☆85 Updated last month
- ☆93 Updated 2 months ago
- An in-context conditioning version of MUSE with pre-trained checkpoints. ☆105 Updated last year
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23. ☆68 Updated 10 months ago
- [CVPR 2024] Official PyTorch implementation of FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition ☆89 Updated 3 weeks ago
- ☆89 Updated 4 months ago