jiazhen-code / PhD
A Prompted Visual Hallucination Evaluation Dataset, featuring over 100,000 data points and four advanced evaluation modes. The dataset includes extensive contextual descriptions, counterintuitive images, and clear indicators of hallucination elements.
☆11 · Updated last week
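In practice, a benchmark like this is consumed as per-image question records scored against gold answers. Below is a minimal sketch of such a scoring loop in Python; the JSON-lines file name and the record fields (`image`, `question`, `context`, `answer`) are assumptions for illustration, not the repository's documented format.

```python
import json

# Minimal sketch of scoring a model on a yes/no hallucination benchmark.
# The JSON-lines layout and the field names ("image", "question",
# "context", "answer") are illustrative assumptions -- consult the
# dataset's own documentation for its actual schema.

def load_records(path):
    """Read one JSON object per line of the (hypothetical) eval file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def accuracy(records, predict):
    """Score predict(image_path, question, context) -> "yes" / "no"."""
    correct = sum(
        predict(r["image"], r["question"], r.get("context", "")).strip().lower()
        == r["answer"]
        for r in records
    )
    return correct / len(records)

if __name__ == "__main__":
    records = load_records("phd_eval.jsonl")  # hypothetical file name
    # Trivial always-"no" baseline: a sanity check before wiring in a real model.
    print(accuracy(records, lambda img, q, ctx: "no"))
```

Whatever model is under test plugs in as the `predict` callable, so the loader and scorer stay independent of any particular MLLM API.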
Related projects
Alternatives and complementary repositories for PhD
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin… ☆46 · Updated 3 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆70 · Updated 5 months ago
- MMICL, a state-of-the-art VLM with in-context learning (ICL) ability, from PKU ☆40 · Updated last year
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models' ☆27 · Updated last week
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆34 · Updated 6 months ago
- [ACL'24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives ☆32 · Updated 2 months ago
- NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media, EMNLP 2021 ☆33 · Updated 2 months ago
- SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection ☆31 · Updated 2 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆134 · Updated 6 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆31 · Updated 2 weeks ago
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation ☆79 · Updated 9 months ago
- Official repository for CoMM Dataset ☆24 · Updated last month
- Official code for our paper "Model Composition for Multimodal Large Language Models" ☆17 · Updated 6 months ago
- Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight) ☆58 · Updated 4 months ago
- [EACL'23] COVID-VTS: Fact Extraction and Verification on Short Video Platforms ☆9 · Updated last year
- Official repository for the A-OKVQA dataset ☆63 · Updated 6 months ago
- [Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling ☆36 · Updated this week
- [NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models ☆32 · Updated 7 months ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆178 · Updated 7 months ago
- Official Code for the ICCV23 Paper: "LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval" ☆41 · Updated last year
- 😎 up-to-date & curated list of awesome LMM hallucination papers, methods & resources. ☆144 · Updated 7 months ago
- NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21) ☆27 · Updated last year
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆29 · Updated 7 months ago
- Official Repository for CVPR 2022 paper "REX: Reasoning-aware and Grounded Explanation" ☆18 · Updated 11 months ago