This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and continuously update our survey, we maintain this repository of relevant references.
★93 · Updated Jul 26, 2024
Alternatives and similar repositories for LVLM-Hallucinations-Survey
Users interested in LVLM-Hallucinations-Survey are comparing it to the repositories listed below.
- A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). (★994, updated Sep 27, 2025)
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" (★253, updated Aug 21, 2025)
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation (★160, updated Jan 15, 2024)
- (★15, updated Oct 15, 2023)
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) (★57, updated Oct 28, 2024)
- An automatic MLLM hallucination detection framework (★19, updated Sep 26, 2023)
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models (★155, updated Apr 30, 2024)
- HallE-Control: Controlling Object Hallucination in LMMs (★31, updated Apr 10, 2024)
- (★28, updated Apr 18, 2025)
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation (★137, updated Sep 11, 2025)
- (★55, updated Apr 1, 2024)
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding (★387, updated Oct 7, 2024)
- Code for the paper "Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models" (★56, updated Dec 18, 2024)
- Official PyTorch implementation of "RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in Large Vision Language…" (★14, updated Dec 16, 2024)
- Context-Aware Multi-View Summarization Network for Image-Text Matching (ACM MM '20) (★29, updated May 26, 2022)
- (★20, updated Oct 21, 2022)
- [NeurIPS 2025] Official implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" (★22, updated Dec 8, 2024)
- [ACL 2025 Findings] Official PyTorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis…" (★25, updated Jul 21, 2024)
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P… (★65, updated Jan 27, 2026)
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models (★87, updated Oct 26, 2025)
- LLM hallucination paper list (★331, updated Mar 11, 2024)
- "A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective", accepted at NeurIPS 2023 (★10, updated May 15, 2024)
- [ACM Multimedia 2025] The official repo for Debiasing Large Visual Language Models, including a post-hoc debias method and Visual… (★83, updated Feb 22, 2025)
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention (★63, updated Jul 16, 2024)
- A curated list of awesome LMM hallucination papers, methods & resources. (★150, updated Mar 23, 2024)
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? (★42, updated Nov 1, 2024)
- (NeurIPS 2025) Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" (★47, updated Jun 3, 2025)
- An up-to-date curated list of state-of-the-art large vision-language model hallucination research: papers & resources (★284, updated Feb 8, 2026)
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering (★16, updated Oct 31, 2024)
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization (★100, updated Jan 30, 2024)
- [ICLR '25] Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (★101, updated Nov 30, 2025)
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation (★97, updated Jan 29, 2024)
- [CVPR 2024 Highlight] Official implementation for Transferable Visual Prompting, from the paper "Exploring the Transferability of Visual Prompt…" (★45, updated Dec 20, 2024)
- The implementation of the paper "EliMRec: Eliminating single-modal bias in multimedia recommendation" (MM '22) (★22, updated Dec 7, 2023)
- (★13, updated Feb 1, 2022)
- (★13, updated Jun 11, 2024)
- MUltiple SUV Thresholding (MUST)-segmenter, a semi-automated PET image segmentation tool that enables delineation of multiple lesions a… (★12, updated May 14, 2025)
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR '24 Highlight) (★84, updated Jul 1, 2024)
- (★102, updated Dec 22, 2023)