Video-R1/Awesome-Multimodal-Reasoning

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Video-R1/Awesome-Multimodal-Reasoning)

Video-R1 / Awesome-Multimodal-Reasoning

Collections of Papers and Projects for Multimodal Reasoning.

☆108

Alternatives and similar repositories for Awesome-Multimodal-Reasoning

Users that are interested in Awesome-Multimodal-Reasoning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

MCG-NJU / CaReBench
View on GitHub
A Fine-grained Benchmark for Video Captioning and Retrieval
☆30Jul 16, 2025Updated last year
MCG-NJU / p-MoD
View on GitHub
[ICCV 2025] p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
☆44Jun 26, 2025Updated last year
MCG-NJU / VideoEval
View on GitHub
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
☆15Jul 31, 2025Updated 11 months ago
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
View on GitHub
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆76Mar 18, 2025Updated last year
Wang-Xiaodong1899 / Open-R1-Video
View on GitHub
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆382Jul 1, 2026Updated 3 weeks ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
zhengyuan-xie / ECCV24_NeST
View on GitHub
[ECCV 2024] Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation
☆39Mar 3, 2025Updated last year
Sun-Haoyuan23 / Awesome-RL-based-Reasoning-MLLMs
View on GitHub
This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…
☆1,435May 11, 2026Updated 2 months ago
TuringEyeTest / TuringEyeTest
View on GitHub
Pixels, Patterns, but no Poetry: To See the World like Humans
☆18Aug 11, 2025Updated 11 months ago
FishAndWasabi / Real-LOD
View on GitHub
Offical implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"
☆34Apr 20, 2025Updated last year
HVision-NKU / MaskCLIPpp
View on GitHub
Official repository of the paper "High-Quality Mask Tuning Matters for Open-Vocabulary Segmentation"
☆47Mar 25, 2025Updated last year
HJYao00 / Awesome-Reasoning-MLLM
View on GitHub
Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAl o1, and Dee…
☆63Mar 18, 2025Updated last year
lyhisme / DeST
View on GitHub
An official code for "A Decoupled Spatio-Temporal Framework for Skeleton-based Action Segmentation".
☆39Dec 15, 2023Updated 2 years ago
ModalMinds / MM-EUREKA
View on GitHub
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
☆770Sep 7, 2025Updated 10 months ago
MCG-NJU / FreeRet
View on GitHub
[ICML2026] FreeRet: MLLMs as Training-Free Retrievers
☆22May 25, 2026Updated 2 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
egoschema / EgoSchema
View on GitHub
☆117Dec 30, 2024Updated last year
OpenGVLab / VideoChat-R1
View on GitHub
[NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning
☆268Oct 18, 2025Updated 9 months ago
MCG-NJU / VideoChat-Online
View on GitHub
[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online
☆97Oct 7, 2025Updated 9 months ago
zhaochen0110 / Awesome_Think_With_Images
View on GitHub
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…
☆1,493Mar 9, 2026Updated 4 months ago
TideDra / lmm-r1
View on GitHub
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆848May 14, 2025Updated last year
HC-Guo / Awesome-Multimodal-Chain-of-Thought
View on GitHub
Collection of papers and repos for multimodal chain-of-thought
☆89Nov 6, 2024Updated last year
MCG-NJU / TimeLens2
View on GitHub
TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs
☆44Updated this week
HVision-NKU / TAR3D
View on GitHub
Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction' (ICCV 2025)
☆77Nov 8, 2025Updated 8 months ago
ZX-Yin / DreamLifting
View on GitHub
The code implementation for the paper "DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation".
☆30Sep 1, 2025Updated 10 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
HVision-NKU / ControlSR
View on GitHub
☆13Apr 19, 2025Updated last year
lizhou-cs / mglmm
View on GitHub
☆32Jun 14, 2026Updated last month
MCG-NJU / Video-DC
View on GitHub
☆12Jul 30, 2025Updated 11 months ago
zhangchbin / awesome-continual-segmentation
View on GitHub
This repo is a collection of AWESOME things about continual semantic segmentation, including papers, code, demos, etc. Feel free to pull …
☆30Aug 21, 2024Updated last year
Youggls / ACROSS-ACL23
View on GitHub
Official code repo for paper: ACROSS: An Alignment-based Framework for Low-Resource Many-to-One Cross-Lingual Summarization
☆12Jul 15, 2023Updated 3 years ago
lwpyh / Awesome-MLLM-Reasoning-Collection
View on GitHub
A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources.
☆36Jul 1, 2026Updated 3 weeks ago
jungao1106 / ICoT
View on GitHub
[CVPR' 25] Interleaved-Modal Chain-of-Thought
☆112Dec 30, 2025Updated 6 months ago
YuxiangChai / OpenSlides
View on GitHub
AI-powered slide workspace for creating, editing, versioning, and presenting beautiful reveal.js decks from prompts and source files.
☆15Apr 14, 2026Updated 3 months ago
MCG-NKU / ExperiCV
View on GitHub
Initial code for computer vision experiments
☆11Jan 1, 2023Updated 3 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
Ethanhuhuhu / KAC
View on GitHub
☆21Jul 22, 2025Updated last year
StarsfieldAI / R1-V
View on GitHub
Witness the aha moment of VLM with less than $3.
☆4,065May 19, 2025Updated last year
HVision-NKU / Cascade-CLIP
View on GitHub
Official implement of ICML2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation
☆58Aug 15, 2024Updated last year
yunfanLu / Self-EvRSVFI
View on GitHub
[IEEE TVCG 2025] Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames
☆11Jun 1, 2025Updated last year
FreedomIntelligence / TRIM
View on GitHub
We introduce new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…
☆22Jan 11, 2026Updated 6 months ago
Fancy-MLLM / R1-Onevision
View on GitHub
R1-onevision, a visual language model capable of deep CoT reasoning.
☆581Apr 13, 2025Updated last year
Osilly / Awesome-Interleaving-Reasoning
View on GitHub
Interleaving Reasoning: Next-Generation Reasoning Systems for AGI
☆280Jun 5, 2026Updated last month