CSU-JPG / Awesome-VLM-ReasoningLinks

☆20

Alternatives and similar repositories for Awesome-VLM-Reasoning

Users that are interested in Awesome-VLM-Reasoning are comparing it to the libraries listed below

Sorting:

The-Martyr / Awesome-Multimodal-Reasoning
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
☆43Updated 3 weeks ago
ligeng0197 / Awesome-Thinking-With-Images
Latest open-source "Thinking with images" (O3/O4-mini) papers, covering training-free, SFT-based, and RL-enhanced methods for "fine-grain…
☆97Updated 3 months ago
RUCAIBox / POPE
The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''
☆226Updated 3 months ago
lhanchao777 / LVLM-Hallucinations-Survey
This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin…
☆87Updated last year
FanqingM / MM-Eureka-V0
MM-Eureka V0 also called R1-Multimodal-Journey, Latest version is in MM-Eureka
☆320Updated 5 months ago
zhyang2226 / OPA-DPO
[CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key
☆86Updated last month
shikiw / Awesome-MLLM-Hallucination
Papers about Hallucination in Multi-Modal Large Language Models (MLLMs)
☆98Updated last year
deepcs233 / Visual-CoT
[Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …
☆396Updated 10 months ago
Wang-Xiaodong1899 / CVPR25-MLLM-Paper-List
🔥CVPR 2025 Multimodal Large Language Models Paper List
☆155Updated 8 months ago
yongliang-wu / ExploreCfg
[NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning
☆42Updated 11 months ago
BillChan226 / HALC
[ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"
☆101Updated 11 months ago
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating
☆98Updated last year
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆69Updated 8 months ago
NishilBalar / Awesome-LVLM-Hallucination
up-to-date curated list of state-of-the-art Large vision language models hallucinations research work, papers & resources
☆208Updated last month
HKUST-LongGroup / Awesome-MLLM-Benchmarks
☆149Updated 9 months ago
jungao1106 / ICoT
[CVPR' 25] Interleaved-Modal Chain-of-Thought
☆92Updated 2 weeks ago
TideDra / VL-RLHF
A RLHF Infrastructure for Vision-Language Models
☆187Updated last year
RupertLuo / VoCoT
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
☆75Updated last year
SooLab / DDCOT
[NeurIPS 2023]DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
☆48Updated last year
NiuTrans / Vision-LLM-Alignment
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi…
☆116Updated 5 months ago
Video-R1 / Awesome-Multimodal-Reasoning
Collections of Papers and Projects for Multimodal Reasoning.
☆105Updated 6 months ago
DAMO-NLP-SG / VCD
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
☆339Updated last year
ustc-hyin / ClearSight
Code for paper: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models
☆36Updated 11 months ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆290Updated last year
shikiw / OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…
☆380Updated last year
opendatalab / HA-DPO
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
☆98Updated last year
Wang-Xiaodong1899 / Open-R1-Video
✨First Open-Source R1-like Video-LLM [2025/02/18]
☆373Updated 8 months ago
pkunlp-icler / MIC
MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU
☆50Updated 4 months ago
tianyi-lab / HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…
☆312Updated last month
MME-Benchmarks / MME-CoT
MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency
☆133Updated 3 months ago