ZuyiZhou / Awesome-Interpretable-Cross-modal-Reasoning
A Survey on Interpretable Cross-modal Reasoning
☆13 · Updated last year
Related projects
Alternatives and complementary repositories for Awesome-Interpretable-Cross-modal-Reasoning
- Official repository for the A-OKVQA dataset ☆64 · Updated 6 months ago
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆33 · Updated 3 weeks ago
- mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigating ☆79 · Updated 9 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆32 · Updated last week
- DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception. ☆15 · Updated 3 months ago
- MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU ☆41 · Updated last year
- [Paper][IJCKG 2022] LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection ☆25 · Updated 9 months ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding" ☆68 · Updated 6 months ago
- The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models" ☆182 · Updated 7 months ago
- 😎 up-to-date & curated list of awesome LMM hallucinations papers, methods & resources. ☆146 · Updated 7 months ago
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆29 · Updated 7 months ago
- This is the first released survey paper on hallucinations of large vision-language models (LVLMs). To keep track of this field and contin… ☆47 · Updated 3 months ago
- my commonly-used tools ☆47 · Updated 3 months ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆135 · Updated 6 months ago
- Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models” ☆47 · Updated 2 years ago
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024) ☆41 · Updated 4 months ago
- VQACL: A Novel Visual Question Answering Continual Learning Setting (CVPR'23) ☆31 · Updated 7 months ago
- Official code for our paper "Model Composition for Multimodal Large Language Models" ☆18 · Updated 6 months ago
- MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering ☆88 · Updated last year
- [ICLR 2023] This is the code repo for our ICLR'23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa… ☆48 · Updated 4 months ago