[CVPR25 Highlight] A ChatGPT-Prompted Visual hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced evaluation modes. The dataset includes extensive contextual descriptions, counterintuitive images, and clear indicators of hallucination items.
☆31 · Apr 16, 2025 · Updated 11 months ago
Alternatives and similar repositories for PhD
Users that are interested in PhD are comparing it to the libraries listed below
- Curated list of awesome LMM hallucinations papers, methods & resources. ☆150 · Mar 23, 2024 · Updated last year
- Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning". ☆18 · Mar 13, 2023 · Updated 3 years ago
- [ICLR '25] Official Pytorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" ☆100 · Nov 30, 2025 · Updated 3 months ago
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib… ☆35 · Jul 14, 2025 · Updated 8 months ago
- STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding? ☆38 · Jan 12, 2026 · Updated 2 months ago
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆42 · May 7, 2025 · Updated 10 months ago
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models ☆77 · Jul 13, 2024 · Updated last year
- Code for Reducing Hallucinations in Vision-Language Models via Latent Space Steering ☆106 · Nov 23, 2024 · Updated last year
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆106 · Jan 9, 2026 · Updated 2 months ago
- XL-VLMs: General Repository for eXplainable Large Vision Language Models ☆46 · Sep 8, 2025 · Updated 6 months ago
- [CVPR 2025] Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Att… ☆68 · Oct 9, 2025 · Updated 5 months ago
- ☆10 · Oct 21, 2024 · Updated last year
- This is the dataset for the competition "Clinical Brain Computer Interfaces Challenge" to be held at WCCI 2020 at Glasgow. There are the … ☆10 · Jan 20, 2022 · Updated 4 years ago
- ☆11 · May 16, 2025 · Updated 10 months ago
- [ICCV 2025] VisRL: Intention-Driven Visual Perception via Reinforced Reasoning ☆46 · Nov 8, 2025 · Updated 4 months ago
- [EMNLP'25] A novel alignment framework that leverages image retrieval to mitigate hallucinations in Vision Language Models. ☆50 · Aug 21, 2025 · Updated 6 months ago
- ✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆52 · Jul 11, 2025 · Updated 8 months ago
- ☆13 · Jun 5, 2023 · Updated 2 years ago
- Official Code for Teacher Assistant-Based Knowledge Distillation Extracting Multi-level Features on Single Channel Sleep EEG (IJCAI 2023) ☆11 · Nov 4, 2023 · Updated 2 years ago
- ☆10 · Jan 19, 2022 · Updated 4 years ago
- [ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration ☆15 · Nov 18, 2025 · Updated 3 months ago
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs ☆24 · Sep 21, 2025 · Updated 5 months ago
- ☆13 · Jul 22, 2022 · Updated 3 years ago
- ☆18 · Aug 7, 2025 · Updated 7 months ago
- In OLHWDB, you can find the ptts files; this code helps you extract the information they contain. ☆11 · Mar 8, 2022 · Updated 4 years ago
- Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals ☆12 · May 24, 2024 · Updated last year
- Pytorch implementation of Detective ☆12 · Jul 11, 2024 · Updated last year
- Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation". ☆50 · Mar 18, 2025 · Updated 11 months ago
- Repository of paper "Continual Learning for LiDAR Semantic Segmentation: Class-Incremental and Coarse-to-Fine strategies on Sparse Data" ☆14 · Oct 26, 2024 · Updated last year
- ☆11 · Aug 29, 2022 · Updated 3 years ago
- Code of the Grounded MUIE model, REAMO ☆11 · Dec 3, 2024 · Updated last year
- Continual Online Recalibration with Pseudo-labels ☆12 · Jun 20, 2024 · Updated last year
- The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI" ☆16 · Mar 25, 2025 · Updated 11 months ago
- This is an example download script to download CT-RATE ☆18 · Apr 5, 2024 · Updated last year
- Code for Learned Thresholds Token Merging and Pruning for Vision Transformers (LTMP). A technique to reduce the size of Vision Transforme… ☆17 · Nov 24, 2024 · Updated last year
- The official implementation of the paper SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder ☆19 · Oct 19, 2025 · Updated 4 months ago
- [NAACL 2025 🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference ☆18 · Jun 19, 2025 · Updated 8 months ago
- Code and data for the paper "Steering Conversational Large Language Models for Long Emotional Support Conversations" along with a UI to v… ☆15 · Apr 14, 2025 · Updated 11 months ago
- An undergraduate thesis project. ☆11 · Jul 13, 2024 · Updated last year