llyx97 / video_reason_benchLinks

A benchmark for evaluating vision-centric, complex video reasoning.

☆34

Alternatives and similar repositories for video_reason_bench

Users that are interested in video_reason_bench are comparing it to the libraries listed below

Sorting:

Liuziyu77 / MIA-DPO
Official implement of MIA-DPO
☆67Updated 9 months ago
Liuziyu77 / MMDU
Official repository of MMDU dataset
☆97Updated last year
NUS-TRAIL / NoisyRollout
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆95Updated last month
appletea233 / Temporal-R1
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
☆57Updated 5 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆164Updated 5 months ago
yhy-2000 / VideoDeepResearch
☆119Updated 3 weeks ago
yale-nlp / MMVU
Data and Code for CVPR 2025 paper "MMVU: Measuring Expert-Level Multi-Discipline Video Understanding"
☆75Updated 8 months ago
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆133Updated 3 months ago
OpenGVLab / V2PE
[ArXiv] V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding
☆57Updated 11 months ago
ModalMinds / MM-PRM
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision
☆25Updated 5 months ago
longvideobench / LongVideoBench
[Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆112Updated last year
Cooperx521 / ScaleCap
Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing’
☆57Updated 4 months ago
TencentARC / Video-Holmes
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
☆76Updated 4 months ago
joez17 / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆50Updated 8 months ago
njucckevin / MM-Self-Improve
A Self-Training Framework for Vision-Language Reasoning
☆86Updated 9 months ago
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆75Updated last year
GaryStack / MMR-V
Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"?
☆36Updated 4 months ago
xinyan-cxy / MINT-CoT
[NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
☆87Updated last month
RifleZhang / LLaVA-Reasoner-DPO
☆99Updated 10 months ago
PKU-YuanGroup / Look-Back
This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".
☆69Updated 4 months ago
Kwai-YuanQi / MM-RLHF
The Next Step Forward in Multimodal LLM Alignment
☆185Updated 6 months ago
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆69Updated 7 months ago
TIGER-AI-Lab / Pixel-Reasoner
Pixel-Level Reasoning Model trained with RL [NeuIPS25]
☆248Updated last week
pipilurj / bootstrapped-preference-optimization-BPO
code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"
☆59Updated last year
Hui-design / Open-LLaVA-Video-R1
[LLaVA-Video-R1]✨First Adaptation of R1 to LLaVA-Video (2025-03-18)
☆35Updated 6 months ago
open-compass / Creation-MMBench
Assessing Context-Aware Creative Intelligence in MLLMs
☆23Updated 3 months ago
MikeWangWZHL / PAPO
Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"
☆95Updated 2 months ago
yu-rp / VisualPerceptionToken
☆125Updated 7 months ago
EvolvingLMMs-Lab / VideoMMMU
☆61Updated 2 months ago
yonseivnl / vlm-rlaif
ACL'24 (Oral) Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback
☆76Updated last year