Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
☆24 · Aug 13, 2024 · Updated last year
Alternatives and similar repositories for Commonsense-T2I
Users that are interested in Commonsense-T2I are comparing it to the libraries listed below.
- Streaming Video Diffusion: Online Video Editing with Diffusion Models ☆18 · Jun 3, 2024 · Updated last year
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆55 · May 8, 2025 · Updated 11 months ago
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆133 · May 16, 2025 · Updated 11 months ago
- [ECCV 2024] The official implementation of the DiffPNG paper in PyTorch. ☆17 · Oct 17, 2024 · Updated last year
- Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation ☆35 · Jun 30, 2025 · Updated 9 months ago
- ☆74 · Jan 27, 2025 · Updated last year
- The official code of "PixelWorld: Towards Perceiving Everything as Pixels" [TMLR25] ☆16 · Sep 12, 2025 · Updated 7 months ago
- This is the official repository for the paper "FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehe… ☆128 · Jan 29, 2026 · Updated 2 months ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement [ACL 2026 Findings]" ☆53 · Apr 7, 2026 · Updated last week
- [CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis ☆62 · Mar 31, 2026 · Updated 2 weeks ago
- ☆13 · Jan 22, 2025 · Updated last year
- [ICCV 2025] Prompt-A-Video ☆23 · Feb 2, 2025 · Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆45 · Nov 29, 2023 · Updated 2 years ago
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆50 · Aug 5, 2025 · Updated 8 months ago
- The official GitHub repo for MixEval-X, the first any-to-any, real-world benchmark. ☆17 · Feb 15, 2025 · Updated last year
- Evaluation codes and data for GenEval2 ☆66 · Jan 8, 2026 · Updated 3 months ago
- This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆164 · Sep 27, 2025 · Updated 6 months ago
- A Massive Multi-Discipline Lecture Understanding Benchmark ☆34 · Nov 1, 2025 · Updated 5 months ago
- ☆26 · Jun 20, 2024 · Updated last year
- ☆24 · Mar 16, 2026 · Updated last month
- Code for FreeTraj, a tuning-free method for trajectory-controllable video generation ☆112 · Sep 19, 2025 · Updated 6 months ago
- ☆15 · Sep 18, 2023 · Updated 2 years ago
- ☆10 · Oct 27, 2023 · Updated 2 years ago
- [NeurIPS 2024] RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models ☆121 · Nov 14, 2024 · Updated last year
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆54 · Jul 7, 2024 · Updated last year
- Extend BoxDiff to SDXL (SDXL-based layout-to-image generation) ☆27 · May 23, 2024 · Updated last year
- Motion-conditional image animation for video editing ☆20 · Dec 2, 2023 · Updated 2 years ago
- Exploring Representation-Aligned Latent Space for Better Generation ☆19 · Mar 17, 2026 · Updated last month
- [NeurIPS 2024 D&B Track] Implementation for "FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models" ☆73 · Dec 27, 2024 · Updated last year
- ☆16 · Dec 6, 2014 · Updated 11 years ago
- [TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study ☆16 · Nov 22, 2024 · Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆19 · Jun 27, 2024 · Updated last year
- ☆16 · Jun 14, 2024 · Updated last year
- ☆37 · Nov 8, 2024 · Updated last year
- PDM-based Purifier ☆22 · Nov 5, 2024 · Updated last year
- Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation ☆23 · Jul 30, 2025 · Updated 8 months ago
- Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects (CVPR 2025) ☆25 · Apr 9, 2025 · Updated last year
- A unified reinforcement learning toolbox for joint RL on language models and diffusion models ☆79 · Mar 31, 2026 · Updated 2 weeks ago
- Official PyTorch implementation of "Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis… ☆46 · Nov 2, 2023 · Updated 2 years ago