Vchitect/Uni-MMMU

[ACL2026 oral] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark

☆25

Alternatives and similar repositories for Uni-MMMU

Users who are interested in Uni-MMMU are comparing it to the repositories listed below.


zhengdian1 / AIA
☆45 · Jan 4, 2026 · Updated 6 months ago
cheryyunl / ROVER
Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation
☆26 · Dec 12, 2025 · Updated 7 months ago
FrankYang-17 / RealUnify
☆27 · Oct 10, 2025 · Updated 9 months ago
wangf3014 / VTok
Official implementation of VTok: A Unified Video Tokenizer with Decoupled Spatial-Temporal Latents
☆15 · Feb 5, 2026 · Updated 5 months ago
penghao-wu / ProxyV
[ICML 2025] Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM
☆20 · May 22, 2025 · Updated last year
QC-LY / UiG
Code for "Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation"
☆15 · Nov 11, 2025 · Updated 8 months ago
Eyeline-Labs / VChain
[ACL 2026 Findings, ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
☆120 · Apr 8, 2026 · Updated 3 months ago
iSEE-Laboratory / PanoDecouple
(CVPR2025 Highlight) Official repository of paper "Panorama Generation From NFoV Image Done Right"
☆19 · May 29, 2025 · Updated last year
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
sen-ye / R3
[ICLR26] Understanding VS. Generation: Navigating Optimization Dilemma in Multimodal Models
☆25 · May 6, 2026 · Updated 2 months ago
iSEE-Laboratory / DiffuVolume
(IJCV2025) The official implementation of "DiffuVolume: Diffusion Model for Volume based Stereo Matching"
☆30 · Jan 15, 2025 · Updated last year
PKU-YuanGroup / UniSandBox
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
☆60 · Nov 27, 2025 · Updated 7 months ago
shulin16 / MMInA
[ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents
☆54 · Feb 27, 2025 · Updated last year
yongliu20 / Awesome-Unified-Understanding-and-Generation
☆52 · Aug 22, 2025 · Updated 11 months ago
csuhan / Tar
[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
☆202 · Sep 18, 2025 · Updated 10 months ago
Vchitect / ShotBench
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
☆102 · Sep 12, 2025 · Updated 10 months ago
WayneJin0918 / SRUM
[ECCV 2026🔥] SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models
☆93 · Nov 26, 2025 · Updated 7 months ago
dongyh20 / Demo-ICL
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition
☆40 · Mar 3, 2026 · Updated 4 months ago
LAW1223 / OpenSubject
☆55 · Dec 10, 2025 · Updated 7 months ago
PhoenixZ810 / RISEBench
[NIPS 2025 DB Oral] Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
☆155 · May 18, 2026 · Updated 2 months ago
appletea233 / EditThinker
Unlocking Iterative Reasoning for Any Image Editor
☆111 · Jan 18, 2026 · Updated 6 months ago
EvolvingLMMs-Lab / sae
A framework that allows you to apply Sparse AutoEncoder to any model
☆53 · Jul 11, 2025 · Updated last year
kolmogorovArnoldFourierNetwork / kaf_act
PyTorch implementation of a learnable activation function combining base activation and Random Fourier Features (RFF). This package provi…
☆13 · Feb 2, 2025 · Updated last year
sherwinbahmani / threed_front_rendering
☆13 · Sep 2, 2023 · Updated 2 years ago
MME-Benchmarks / MME-Unify
✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆42 · Apr 10, 2025 · Updated last year
Vchitect / RealDPO
☆32 · Dec 17, 2025 · Updated 7 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 4 months ago
iSEE-Laboratory / EgoExo-Fitness
(ECCV 2024) Official repository of paper "EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding"
☆38 · Apr 8, 2025 · Updated last year
penghao-wu / visual_jigsaw
☆78 · Apr 9, 2026 · Updated 3 months ago
Visual-AI / Pancap
[NeurIPS 2025] Panoptic Captioning: An Equivalence Bridge for Image and Text
☆38 · Jan 31, 2026 · Updated 5 months ago
showlab / FQGAN
FQGAN: Factorized Visual Tokenization and Generation
☆59 · Mar 29, 2025 · Updated last year
renyu2002 / SJTU_SE_CG
Code repository for the undergraduate Computer Graphics course at the School of Software, Shanghai Jiao Tong University
☆14 · Oct 3, 2025 · Updated 9 months ago
Luodian / GenBench
Benchmarking and Analyzing Generative Data for Visual Recognition
☆26 · Jul 25, 2023 · Updated 2 years ago
jiaming-zhou / X-ICM
Official repo for AGNOSTOS, a cross-task manipulation benchmark, and the X-ICM method, a cross-task in-context manipulation (VLA) method
☆69 · May 28, 2026 · Updated last month
Owen718 / LongPrompt-LLamaGen
This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…
☆30 · Oct 21, 2024 · Updated last year
PKU-YuanGroup / UAE
Official repository for the UAE paper, unified-GRPO, and unified-Bench
☆165 · Sep 12, 2025 · Updated 10 months ago
arctanxarc / GENIUS
☆42 · May 9, 2026 · Updated 2 months ago
1ranGuan / VST
[ECCV 26] Video Streaming Thinking
☆115 · Jun 18, 2026 · Updated last month