Taaccoo / awesome-vqa-latest

Visual Question Answering Paper List.

☆53

Alternatives and similar repositories for awesome-vqa-latest

Users who are interested in awesome-vqa-latest are comparing it to the repositories listed below.

yuleiniu / cfvqa
[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias
☆126 Updated 3 years ago
NeverMoreLCH / Awesome-VQA
A reading list of papers about Visual Question Answering.
☆35 Updated 3 years ago
AndersonStra / MuKEA
MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering
☆99 Updated 2 years ago
yanxinzju / CSS-VQA
Counterfactual Samples Synthesizing for Robust VQA
☆79 Updated 3 years ago
jialinwu17 / MAVEX
☆30 Updated 2 years ago
aioz-ai / CFR_VQA
Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)
☆48 Updated 3 years ago
CCYChongyanChen / VQA_AlgorithmDatasets
☆38 Updated 2 years ago
aditya10 / VLC-BERT
Code for WACV 2023 paper "VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge"
☆21 Updated 2 years ago
Zhiquan-Wen / D-VQA
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)
☆27 Updated 3 years ago
terry-r123 / Awesome-Captioning
A curated list of multimodal captioning research (including image captioning, video captioning, and text captioning)
☆112 Updated 3 years ago
MikeWangWZHL / VidIL
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
☆115 Updated 3 years ago
soloist97 / densecap-pytorch
A simplified PyTorch version of densecap
☆42 Updated 11 months ago
microsoft / PICa
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)
☆86 Updated 3 years ago
GT-RIPL / Xmodal-Ctx
Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for …
☆60 Updated 3 years ago
guoyang9 / UnifER
Official implementation for the MM'22 paper.
☆13 Updated 3 years ago
zmykevin / UVLP
CVPR 2022 (Oral) PyTorch code for "Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment"
☆22 Updated 3 years ago
cdancette / vqa-cp-leaderboard
A collection of papers about the VQA-CP datasets and their results
☆41 Updated 3 years ago
qinzzz / Multimodal-Alignment-Framework
Implementation for MAF: Multimodal Alignment Framework
☆46 Updated 5 years ago
tgc1997 / Awesome-Video-Captioning
A curated list of research papers in Video Captioning
☆121 Updated 4 years ago
PhoebusSi / SAR
Code for our ACL2021 paper: "Check It Again: Progressive Visual Question Answering via Visual Entailment"
☆31 Updated 4 years ago
CrossmodalGroup / SSL-VQA
Code for our IJCAI2020 paper: Overcoming Language Priors with Self-supervised Learning for Visual Question Answering
☆52 Updated 5 years ago
daqingliu / coco-caption
A Python 3 version of coco-caption with SPICE.
☆20 Updated 5 years ago
bladewaltz1 / ModeCap
Controllable image captioning model with unsupervised modes
☆21 Updated 2 years ago
shubhamagarwal92 / visdial_conv
This repository contains code used in our ACL'20 paper "History for Visual Dialog: Do we really need it?"
☆34 Updated 2 years ago
phellonchen / awesome-visual-dialog
Recent Advances in Visual Dialog
☆30 Updated 3 years ago
ZihaoW123 / UniMM
Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog"
☆13 Updated 2 years ago
google-deepmind / svo_probes
The SVO-Probes Dataset for Verb Understanding
☆31 Updated 3 years ago
gujiuxiang / unpaired_image_captioning
Unpaired Image Captioning
☆36 Updated 4 years ago
yashkant / sam-textvqa
Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.
☆65 Updated 4 years ago
jokieleung / CL-VQA
The implementation of the EMNLP 2020 paper "Learning to Contrast the Counterfactual Samples for Robust Visual Question Answering"
☆15 Updated 4 years ago