Ruiyang-061X / Awesome-MLLM-Reasoning

📖 Curated list about the reasoning ability of MLLMs, including OpenAI o1, OpenAI o3-mini, and Slow-Thinking.

☆12

Alternatives and similar repositories for Awesome-MLLM-Reasoning

Users who are interested in Awesome-MLLM-Reasoning are comparing it to the repositories listed below

Ruiyang-061X / Uncertainty-o
✨ Official code for our paper: "Uncertainty-o: One Model-agnostic Framework for Unveiling Epistemic Uncertainty in Large Multimodal Model…
☆17Updated 7 months ago
zjunlp / Deco
[ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
☆118Updated last month
minglllli / CLS-RL
[NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
☆71Updated last month
1zhou-Wang / MemVR
[ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in…
☆164Updated last month
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆75Updated last year
The-Martyr / CausalMM
[ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
☆50Updated 4 months ago
ADaM-BJTU / Mind_with_eyes_Awesome_MLLMs_Reasoning
This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning!
☆54Updated 7 months ago
yfzhang114 / LLaVA-Align
[ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual…
☆82Updated 8 months ago
Dongping-Chen / MLLM-Judge
[ICML 2024 Oral] Official code repository for MLLM-as-a-Judge.
☆86Updated 8 months ago
jungao1106 / ICoT
[CVPR' 25] Interleaved-Modal Chain-of-Thought
☆90Updated this week
UCSC-VLAA / VLAA-Thinking
[TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
☆139Updated last month
CRIPAC-DIG / LogicCheckGPT
[ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio…
☆24Updated 9 months ago
ys-zong / VL-ICL
[ICLR 2025] VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning
☆65Updated last month
SooLab / DDCOT
[NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
☆46Updated last year
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆54Updated last year
Sreyan88 / VDGD
Code for ICLR 2025 Paper: Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs
☆21Updated 6 months ago
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆133Updated 3 months ago
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
☆98Updated last year
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆96Updated last year
Ruiyang-061X / VL-Uncertainty
🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
☆45Updated 7 months ago
maifoundations / Visionary-R1
Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning
☆41Updated 4 months ago
pritamqu / HALVA
[ICLR 2025] Data-Augmented Phrase-Level Alignment for Mitigating Object Hallucination
☆16Updated 9 months ago
Ghy0501 / HiDe-LLaVA
[ACL'25 Main] Official Implementation of HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Languag…
☆36Updated 2 months ago
chancharikmitra / CCoT
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
☆140Updated last year
gyhdog99 / MoCLE
MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379)
☆44Updated 4 months ago
MLRM-Halu / MLRM-Halu
[NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models
☆66Updated 5 months ago
shengliu66 / VTI
Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering
☆86Updated 11 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆83Updated last year
swordlidev / Evaluation-Multimodal-LLMs-Survey
A Survey on Benchmarks of Multimodal Large Language Models
☆143Updated 4 months ago
pkunlp-icler / MIC
MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU
☆50Updated 3 months ago