THUNLP-MT / FIIG

Filling the Image Information Gap for VQA: Prompting Large Language Models to Proactively Ask Questions (EMNLP 2023 Findings)

☆8

Alternatives and similar repositories for FIIG

Users who are interested in FIIG are comparing it to the repositories listed below


GaryJiajia / OFv2_ICL_VQA
[CVPR 2024] How to Configure Good In-Context Sequence for Visual Question Answering
☆20 Updated 3 months ago
open-vision-language / infoseek
☆60 Updated last year
allenai / aokvqa
Official repository for the A-OKVQA dataset
☆97 Updated last year
RUCAIBox / POPE
The official GitHub page for "Evaluating Object Hallucination in Large Vision-Language Models"
☆219 Updated 2 weeks ago
edchengg / infoseek_eval
EMNLP 2023 - InfoSeek: A New VQA Benchmark Focused on Visual Info-Seeking Questions
☆25 Updated last year
Gary-code / KECVQG
[ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"
☆10 Updated last year
X-PLUG / mPLUG-HalOwl
mPLUG-HalOwl: Multimodal Hallucination Evaluation and Mitigation
☆96 Updated last year
AndersonStra / MuKEA
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
☆97 Updated 2 years ago
vipulgupta1011 / swapmix
☆20 Updated 2 years ago
PhoebusSi / MMBS
Code for our EMNLP-2022 paper: "Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning"
☆16 Updated 2 years ago
doc-doc / NExT-OE
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆30 Updated 2 years ago
doc-doc / NExT-QA
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)
☆169 Updated last month
Zhiquan-Wen / D-VQA
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)
☆25 Updated 2 years ago
Yushi-Hu / PromptCap
Natural language guided image captioning
☆85 Updated last year
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆149 Updated last year
PLUM-Lab / MultiInstruct
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
☆135 Updated 2 years ago
luomancs / ReMuQ
A multimodal retrieval dataset
☆24 Updated 2 years ago
yuezih / less-is-more
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
☆55 Updated 10 months ago
VRU-NExT / VideoQA
☆95 Updated 2 years ago
phellonchen / awesome-visual-dialog
Recent Advances in Visual Dialog
☆30 Updated 3 years ago
LightChen233 / M3CoT
☆81 Updated last year
microsoft / PICa
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)
☆85 Updated 3 years ago
HaozheZhao / MIC_tool
☆14 Updated last year
thunlp / PEVL
Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models”
☆48 Updated 2 years ago
FuxiaoLiu / LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
☆285 Updated last year
open-vision-language / oven
☆40 Updated 2 years ago
The-Martyr / Awesome-Multimodal-Reasoning
Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
☆36 Updated this week
xieyuquanxx / awesome-Large-MultiModal-Hallucination
😎 A curated list of awesome LMM hallucination papers, methods & resources.
☆149 Updated last year
PhoebusSi / VQA-VS
Code for our EMNLP-2022 paper: "Language Prior Is Not the Only Shortcut: A Benchmark for Shortcut Learning in VQA"
☆40 Updated 2 years ago
Go2Heart / EchoSight
[EMNLP 2024 Findings] The official PyTorch implementation of EchoSight: Advancing Visual-Language Models with Wiki Knowledge.
☆73 Updated 2 months ago