wmchen / vis_diffusion_attention
Visualize attention maps in Diffusion Models
☆16 · Updated 2 weeks ago
Alternatives and similar repositories for vis_diffusion_attention:
Users who are interested in vis_diffusion_attention are comparing it to the repositories listed below.
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆54 · Updated last week
- World Simulator Assistant for Physics-Aware Text-to-Video Generation ☆9 · Updated 2 weeks ago
- [ECCV 2024] Official repository of ECCV 2024 paper: Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion M… ☆14 · Updated 2 months ago
- This is the official implementation for ControlVAR. ☆101 · Updated 3 months ago
- A collection of diffusion models based on FLUX/DiT for image/video generation, editing, reconstruction, inpainting, etc. ☆35 · Updated this week
- Official Implementation of VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention ☆33 · Updated this week
- A collection of vision foundation models unifying understanding and generation. ☆47 · Updated 2 months ago
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models ☆68 · Updated 10 months ago
- ☆20 · Updated last year
- Official code repo of the CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation ☆17 · Updated last week
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆107 · Updated 4 months ago
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation ☆72 · Updated this week
- Implements VAR+CLIP for text-to-image (T2I) generation ☆131 · Updated 2 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆296 · Updated 3 weeks ago
- [CVPR 2025] Open implementation of "RandAR" ☆69 · Updated last week
- [CVPR 2024] InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization ☆56 · Updated 9 months ago
- [CVPR 2024] Official implementation of CVPR 2024 paper: "Doubly Abductive Counterfactual Inference for Text-based Image Editing" ☆23 · Updated last year
- ☆56 · Updated last week
- PyTorch implementation of InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following ☆30 · Updated 2 months ago
- [ECCV 2024] The official implementation of the DiffPNG paper in PyTorch. ☆11 · Updated 5 months ago
- The code and data of the paper: Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation ☆95 · Updated 5 months ago
- The collection of awesome papers on alignment of diffusion models. ☆149 · Updated 3 weeks ago
- Replication in Visual Diffusion Models: A Survey and Outlook ☆28 · Updated 7 months ago
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆30 · Updated 3 months ago
- [NeurIPS 2024] COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing ☆23 · Updated 3 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆62 · Updated last week
- Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment. ☆70 · Updated 3 weeks ago
- This is a repo to track the latest autoregressive visual generation papers. ☆178 · Updated this week
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics ☆88 · Updated 2 weeks ago
- [ICLR 2025] The code of Z-Sampling, proposed in our paper "Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflectio… ☆62 · Updated last month