MME-Benchmarks/Video-MME-v2

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

☆369

Alternatives and similar repositories for Video-MME-v2

Users interested in Video-MME-v2 are comparing it to the repositories listed below.

Northern-byte-bit / SpeechParaling-Bench
☆30 · May 21, 2026 · Updated 2 months ago
yangruoliu / VideoDetective
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
☆58 · May 1, 2026 · Updated 2 months ago
MiG-NJU / PersonaVLM
[CVPR 2026 Highlight] PersonaVLM: Long-Term Personalized Multimodal LLMs
☆112 · Apr 16, 2026 · Updated 3 months ago
VITA-MLLM / Omni-Diffusion
✨✨[ICML 2026] Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
☆153 · Mar 12, 2026 · Updated 4 months ago
MME-Benchmarks / Video-MME
✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
☆787 · Dec 8, 2025 · Updated 7 months ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆198 · May 1, 2025 · Updated last year
VITA-MLLM / Long-VITA
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
☆305 · May 14, 2025 · Updated last year
MiG-NJU / EvoEmbedding
EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory
☆52 · Updated this week
VITA-MLLM / Sparrow
Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation
☆32 · Mar 28, 2025 · Updated last year
VisionXLab / GRADE
[ECCV'26] GRADE: Grounded Reasoning Assessment for Discipline-informed Editing
☆29 · Apr 23, 2026 · Updated 3 months ago
VITA-MLLM / VITA-QinYu
VITA-QINYU: Expressive Spoken Language Model for Role-Playing and Singing
☆121 · Jul 14, 2026 · Updated last week
MAC-AutoML / QuoTA
✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…
☆79 · Apr 28, 2025 · Updated last year
Tencent / VITA
The official implementation of VITA, VITA15, LongVITA, VITA-Audio, VITA-VLA, and VITA-E.
☆162 · Oct 28, 2025 · Updated 8 months ago
MAC-AutoML / Awesome-Efficient-Large-Models
A list of awesome papers on compression and acceleration of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs).
☆16 · May 12, 2026 · Updated 2 months ago
Visionary-Laboratory / PhotoFlow
PhotoFlow: Agentic 3D Virtual Photography Missions
☆38 · May 27, 2026 · Updated last month
VisionXLab / Moment-Video
☆19 · Jun 2, 2026 · Updated last month
VisionXLab / EvoTok
[ECCV'26] Code repo for "EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation"
☆22 · Jun 18, 2026 · Updated last month
VisionXLab / Rise-Video
RISE-Video: Can Video Generators Decode Implicit World Rules?
☆28 · Mar 26, 2026 · Updated 3 months ago
MME-Benchmarks / MME-Unify
✨✨ [ICLR 2026] MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
☆42 · Apr 10, 2025 · Updated last year
VITA-MLLM / Woodpecker
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models
☆649 · Dec 23, 2024 · Updated last year
zhourax / VEGA
☆38 · Jul 9, 2024 · Updated 2 years ago
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated last year
hwanyu112 / VIBE-Benchmark
☆27 · Feb 3, 2026 · Updated 5 months ago
dongyh20 / Demo-ICL
Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition
☆41 · Mar 3, 2026 · Updated 4 months ago
EvolvingLMMs-Lab / OneVision-Encoder
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆386 · Jun 20, 2026 · Updated last month
BradyFU / DVG-Face
[TPAMI 2021] DVG-Face: Dual Variational Generation for Heterogeneous Face Recognition
☆76 · Nov 13, 2023 · Updated 2 years ago
Visionary-Laboratory / SpaceDG
SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation
☆31 · Jul 9, 2026 · Updated 2 weeks ago
xjtupanda / Sparrow
Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"
☆48 · Sep 3, 2025 · Updated 10 months ago
synvo-ai / HippoCamp
A benchmark for evaluating contextual agents on realistic multimodal personal-computer environments with profiling and factual-retention …
☆29 · Apr 2, 2026 · Updated 3 months ago
ChoS3nE11ven / Agentic-MME
☆36 · Apr 13, 2026 · Updated 3 months ago
OpenEvaluation / VLMEvalKit
☆23 · Apr 11, 2026 · Updated 3 months ago
VITA-MLLM / VITA
✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
☆2,521 · Mar 28, 2025 · Updated last year
FrankYang-17 / RealUnify
☆27 · Oct 10, 2025 · Updated 9 months ago
TencentARC / Video-Holmes
[ECCV 2026] Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
☆95 · Jul 13, 2025 · Updated last year
yfzhang114 / Thyme
✨✨ [ICLR 2026] Think Beyond Images
☆583 · Sep 23, 2025 · Updated 10 months ago
Visionary-Laboratory / CourtSI
Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports
☆71 · Mar 15, 2026 · Updated 4 months ago
VisionXLab / FIRM-Reward
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation
☆40 · Mar 13, 2026 · Updated 4 months ago
aurateam2026 / AURA
☆116 · Jun 5, 2026 · Updated last month
MAC-AutoML / SpecEyes
[ECCV 2026🔥] This is the official implementation of our paper "SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception…
☆62 · Apr 2, 2026 · Updated 3 months ago