zeyofu / Commonsense-T2I
Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
☆24 · Aug 13, 2024 · Updated last year
Alternatives and similar repositories for Commonsense-T2I
Users who are interested in Commonsense-T2I are comparing it to the repositories listed below.
- Streaming Video Diffusion: Online Video Editing with Diffusion Models ☆18 · Jun 3, 2024 · Updated last year
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆53 · May 8, 2025 · Updated 9 months ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆52 · Dec 5, 2024 · Updated last year
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25] ☆16 · Sep 12, 2025 · Updated 5 months ago
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation ☆33 · Jun 30, 2025 · Updated 7 months ago
- [ECCV 2024] The official implementation of the DiffPNG paper in PyTorch. ☆15 · Oct 17, 2024 · Updated last year
- [TACL] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- ☆22 · May 11, 2025 · Updated 9 months ago
- [ICCV 2025] Prompt-A-Video ☆20 · Feb 2, 2025 · Updated last year
- This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Gener… ☆17 · Jul 1, 2024 · Updated last year
- ☆16 · Jun 14, 2024 · Updated last year
- [ICCV 2025] The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation ☆21 · Oct 12, 2025 · Updated 4 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆159 · Sep 27, 2025 · Updated 4 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆130 · May 16, 2025 · Updated 8 months ago
- ☆15 · Sep 18, 2023 · Updated 2 years ago
- Exploring Representation-Aligned Latent Space for Better Generation ☆17 · Feb 4, 2025 · Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆19 · Jun 27, 2024 · Updated last year
- ☆16 · Oct 21, 2024 · Updated last year
- ☆73 · Jan 27, 2025 · Updated last year
- [NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models ☆119 · Nov 14, 2024 · Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆45 · Nov 29, 2023 · Updated 2 years ago
- Visualize attention maps in Diffusion Models ☆22 · Mar 10, 2025 · Updated 11 months ago
- ☆17 · Oct 1, 2024 · Updated last year
- Official PyTorch implementation of "Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis… ☆46 · Nov 2, 2023 · Updated 2 years ago
- TraDiffusion: Trajectory-Based Training-Free Image Generation ☆54 · Nov 10, 2024 · Updated last year
- ☆17 · Apr 11, 2022 · Updated 3 years ago
- Visual and Embodied Concepts evaluation benchmark ☆21 · Oct 10, 2023 · Updated 2 years ago
- The official PyTorch implementation for Improving Long-Text Alignment for Text-to-Image Diffusion Models (LongAlign) ☆80 · Apr 23, 2025 · Updated 9 months ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆50 · Aug 5, 2025 · Updated 6 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆89 · Sep 26, 2024 · Updated last year
- [ECCV’24] Official repository for "BEAF: Observing Before-AFter Changes to Evaluate Hallucination in Vision-language Models" ☆21 · Mar 26, 2025 · Updated 10 months ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models". ☆52 · Dec 28, 2025 · Updated last month
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆53 · Jul 7, 2024 · Updated last year
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆23 · Jul 30, 2025 · Updated 6 months ago
- Evaluation codes and data for GenEval2 ☆55 · Jan 8, 2026 · Updated last month
- ☆27 · Jun 4, 2024 · Updated last year
- [ICIP 2025] Scribble-Guided Diffusion for Training-free Text-to-Image Generation ☆24 · Oct 2, 2024 · Updated last year
- ☆50 · Oct 29, 2023 · Updated 2 years ago
- Extend BoxDiff to SDXL (SDXL-based layout-to-image generation) ☆26 · May 23, 2024 · Updated last year