zeyofu / Commonsense-T2ILinks
Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
☆22Updated 11 months ago
Alternatives and similar repositories for Commonsense-T2I
Users that are interested in Commonsense-T2I are comparing it to the libraries listed below
Sorting:
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆60Updated this week
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation☆76Updated 4 months ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning