This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and continuously update our survey, we maintain this repository of relevant references.
☆96 · Jul 26, 2024 · Updated last year
Alternatives and similar repositories for LVLM-Hallucinations-Survey
Users who are interested in LVLM-Hallucinations-Survey compare it to the repositories listed below.
- A curated list of resources dedicated to hallucination of multimodal large language models (MLLM). ☆1,026 · Sep 27, 2025 · Updated 9 months ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆263 · Aug 21, 2025 · Updated 10 months ago
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation ☆169 · Jan 15, 2024 · Updated 2 years ago
- ☆16 · Oct 15, 2023 · Updated 2 years ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆57 · Oct 28, 2024 · Updated last year
- An automatic MLLM hallucination detection framework ☆19 · Sep 26, 2023 · Updated 2 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆158 · Apr 30, 2024 · Updated 2 years ago
- HallE-Control: Controlling Object Hallucination in LMMs ☆32 · Apr 10, 2024 · Updated 2 years ago
- ☆33 · Apr 18, 2025 · Updated last year
- GitHub repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025) ☆76 · May 2, 2025 · Updated last year
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding ☆409 · Oct 7, 2024 · Updated last year
- Code for paper: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models ☆59 · Dec 18, 2024 · Updated last year
- Context-Aware Multi-View Summarization Network for Image-Text Matching. ACM MM'20 ☆29 · May 26, 2022 · Updated 4 years ago
- ☆20 · Oct 21, 2022 · Updated 3 years ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding" ☆22 · Dec 8, 2024 · Updated last year
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models ☆20 · Jul 17, 2024 · Updated last year
- [ACL 2025 Findings] Official pytorch implementation of "Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vis… ☆25 · Jul 21, 2024 · Updated last year
- Dynamic Modality Interaction Modeling for Image-Text Retrieval. SIGIR'21 ☆70 · Apr 5, 2026 · Updated 2 months ago
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P… ☆67 · Jan 27, 2026 · Updated 5 months ago
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆87 · Oct 26, 2025 · Updated 8 months ago
- LLM hallucination paper list ☆336 · Mar 11, 2024 · Updated 2 years ago
- A Novel Approach for Effective Multi-View Clustering with Information-Theoretic Perspective is a paper accepted by NeurIPS 2023 ☆10 · May 15, 2024 · Updated 2 years ago
- [ACM Multimedia 2025] This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and Visual… ☆83 · Feb 22, 2025 · Updated last year
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆68 · Jul 16, 2024 · Updated last year
- Curated list of awesome LMM hallucinations papers, methods & resources. ☆150 · Mar 23, 2024 · Updated 2 years ago
- (NeurIPS 2025) Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆51 · Jun 3, 2025 · Updated last year
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆43 · Nov 1, 2024 · Updated last year
- Up-to-date curated list of state-of-the-art large vision-language model hallucination research work, papers & resources ☆323 · Feb 8, 2026 · Updated 4 months ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering ☆18 · Oct 31, 2024 · Updated last year
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆105 · Nov 30, 2025 · Updated 7 months ago
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ☆100 · Jan 29, 2024 · Updated 2 years ago
- The implementation of paper "EliMRec: Eliminating single-modal bias in multimedia recommendation", MM'22. ☆22 · Dec 7, 2023 · Updated 2 years ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆45 · Dec 20, 2024 · Updated last year
- ☆13 · Feb 1, 2022 · Updated 4 years ago
- ☆14 · Jun 11, 2024 · Updated 2 years ago
- MUltiple SUV Thresholding (MUST)-segmenter is a semi-automated PET image segmentation tool that enables delineation of multiple lesions a… ☆12 · Mar 18, 2026 · Updated 3 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆89 · Jul 1, 2024 · Updated last year
- ☆102 · Dec 22, 2023 · Updated 2 years ago
- [NeurIPS2023] LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering ☆12 · Jan 5, 2024 · Updated 2 years ago