jiazhen-code / PhD
[CVPR25] A ChatGPT-Prompted Visual Hallucination Evaluation Dataset, featuring over 100,000 data samples and four advanced evaluation modes. The dataset includes extensive contextual descriptions, counterintuitive images, and clear indicators of hallucination items.
☆14 · Updated last month
Alternatives and similar repositories for PhD:
Users who are interested in PhD are comparing it to the repositories listed below:
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin… ☆63 · Updated 8 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆85 · Updated 3 months ago
- [CVPR 2024] How to Configure Good In-Context Sequence for Visual Question Answering ☆17 · Updated 7 months ago
- [ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models" ☆49 · Updated 6 months ago
- 🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation". ☆30 · Updated 2 weeks ago
- [ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling ☆76 · Updated 2 months ago
- Instruction Tuning in Continual Learning paradigm ☆44 · Updated last month
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ☆94 · Updated last year
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models. ☆47 · Updated 2 weeks ago
- [NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models ☆41 · Updated last year
- The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" accepted by NeurIPS… ☆24 · Updated 10 months ago
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU ☆47 · Updated last year
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆22 · Updated 3 weeks ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆145 · Updated 11 months ago
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models ☆17 · Updated this week
- [ACL’24 Findings] Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives ☆38 · Updated 7 months ago
- [CVPR 2024] Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension ☆48 · Updated 11 months ago
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval ☆31 · Updated 8 months ago
- Official resource for paper Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models (ACL 20… ☆9 · Updated 7 months ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23) ☆35 · Updated last year
- Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos ☆40 · Updated 11 months ago
- [ICCV2023] - CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation ☆31 · Updated 5 months ago
- [EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge. ☆55 · Updated 2 weeks ago
- the official repo for EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimo… ☆19 · Updated 2 weeks ago
- up-to-date curated list of state-of-the-art Large vision language models hallucinations research work, papers & resources ☆107 · Updated last month