yczhou001 / LongBench-T2I
Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation
☆23 Updated 4 months ago
Alternatives and similar repositories for LongBench-T2I
Users who are interested in LongBench-T2I are comparing it to the repositories listed below:
- A Collection of Papers on Diffusion Language Models ☆154 Updated 4 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠 ☆120 Updated 8 months ago
- Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin". ☆35 Updated 6 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆140 Updated 3 weeks ago
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆83 Updated 6 months ago
- [NeurIPS'25] HoliTom: Holistic Token Merging for Fast Video Large Language Models ☆70 Updated 3 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 Updated 9 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆104 Updated 4 months ago
- ☆59 Updated 5 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆103 Updated last year
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆114 Updated 6 months ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation ☆37 Updated 6 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆64 Updated 4 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation ☆235 Updated 5 months ago
- Official Repository: A Comprehensive Benchmark for Logical Reasoning in MLLMs ☆45 Updated 7 months ago
- Official implementation of MIA-DPO ☆70 Updated last year
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆94 Updated last year
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆88 Updated 4 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆37 Updated 7 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆179 Updated 2 months ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆79 Updated 6 months ago
- Official implementation of "Diffusion Language Models Know the Answer Before Decoding" ☆43 Updated 4 months ago
- ☆37 Updated 5 months ago
- The code repository of UniRL ☆51 Updated 8 months ago
- [NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing ☆139 Updated last month
- Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning" ☆110 Updated last month
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆232 Updated last week
- [ICCV 2025] p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay ☆43 Updated 7 months ago
- (ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph. ☆31 Updated 5 months ago
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models ☆97 Updated 2 months ago