Purshow / Awesome-Unified-MultimodalLinks

📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.

☆319

Alternatives and similar repositories for Awesome-Unified-Multimodal

Users that are interested in Awesome-Unified-Multimodal are comparing it to the libraries listed below

Sorting:

ByteVisionLab / TokenFlow
[CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
☆393Updated 2 months ago
Wang-Xiaodong1899 / CVPR25-MLLM-Paper-List
🔥CVPR 2025 Multimodal Large Language Models Paper List
☆156Updated 7 months ago
cokeshao / Awesome-Multimodal-Token-Compression
Survey: https://arxiv.org/pdf/2507.20198
☆179Updated this week
TencentARC / TokLIP
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
☆226Updated 2 months ago
CodeGoat24 / UniGenBench
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
☆97Updated this week
PKU-YuanGroup / WISE
WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation
☆157Updated 3 weeks ago
dvlab-research / VisionZip
Official repository for VisionZip (CVPR 2025)
☆363Updated 3 months ago
arctanxarc / UniCTokens
A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…
☆121Updated 3 weeks ago
showlab / Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
☆725Updated 2 weeks ago
WayneJin0918 / SOTA-paper-rating.io
A tiny paper rating web
☆39Updated 7 months ago
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think
☆574Updated this week
Video-R1 / Awesome-Multimodal-Reasoning
Collections of Papers and Projects for Multimodal Reasoning.
☆105Updated 6 months ago
AIDC-AI / Awesome-Unified-Multimodal-Models
Awesome Unified Multimodal Models
☆805Updated 2 months ago
lxa9867 / Awesome-Autoregressive-Visual-Generation
This is a repo to track the latest autoregressive visual generation papers.
☆405Updated 4 months ago
PhoenixZ810 / RISEBench
[NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
☆110Updated this week
arctanxarc / MC-LLaVA
Official implementation of MC-LLaVA.
☆140Updated 2 months ago
rongyaofang / GoT
Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"
☆291Updated 3 weeks ago
Osilly / Awesome-Interleaving-Reasoning
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆187Updated last week
Cooperx521 / PyramidDrop
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆132Updated 7 months ago
ML-GSAI / Diffusion-LLM-Papers
A Collection of Papers on Diffusion Language Models
☆134Updated last month
mit-han-lab / vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
☆398Updated 6 months ago
facebookresearch / metamorph
Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning
☆214Updated 6 months ago
yaolinli / TimeChat-Online
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆86Updated last month
zhyang2226 / OPA-DPO
[CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
☆83Updated 3 weeks ago
HKUST-LongGroup / Awesome-MLLM-Benchmarks
☆143Updated 8 months ago
mrwu-mac / ControlMLLM
[NeurIPS2024] Repo for the paper `ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models'
☆194Updated 3 months ago
saccharomycetes / mllms_know
[ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'
☆278Updated 6 months ago
Haochen-Wang409 / ross
[ICLR'25] Reconstructive Visual Instruction Tuning
☆121Updated 6 months ago
ChaofanTao / Autoregressive-Models-in-Vision-Survey
[TMLR 2025🔥] A survey for the autoregressive models in vision.
☆725Updated this week
shxie2020 / Awesome-UGVFM
A collection of vision foundation models unifying understanding and generation.
☆57Updated 9 months ago