yashkant / concat-vqa

Official code for the paper "Contrast and Classify: Training Robust VQA Models" published at ICCV, 2021

☆19

Alternatives and similar repositories for concat-vqa

Users who are interested in concat-vqa are comparing it to the repositories listed below

zmykevin / UVLP
CVPR 2022 (Oral) Pytorch Code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
☆22 Updated 3 years ago
SpencerWhitehead / novelvqa
☆27 Updated 4 years ago
zaynmi / seada-vqa
A PyTorch implementation of a data augmentation method for visual question answering
☆21 Updated 2 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 Updated 3 years ago
jialinwu17 / MAVEX
☆30 Updated 2 years ago
wenhuchen / Meta-Module-Network
Code for WACV 2021 Paper "Meta Module Network for Compositional Visual Reasoning"
☆43 Updated 4 years ago
zinengtang / DeCEMBERT
Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 Updated 2 years ago
gujiuxiang / unpaired_image_captioning
Unpaired Image Captioning
☆36 Updated 4 years ago
fenglinliu98 / MIA
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)
☆65 Updated 5 years ago
jayleicn / VideoLanguageFuturePred
[EMNLP 2020] What is More Likely to Happen Next? Video-and-Language Future Event Prediction
☆51 Updated 3 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆115 Updated 3 years ago
yanxinzju / CSS-VQA
Counterfactual Samples Synthesizing for Robust VQA
☆79 Updated 3 years ago
tejas-gokhale / vqa_mutant
☆13 Updated 3 years ago
YuanEZhou / Grounded-Image-Captioning
☆64 Updated 3 years ago
yuleiniu / introd
[NeurIPS 2021] Introspective Distillation for Robust Question Answering
☆13 Updated 3 years ago
NeverMoreLCH / Awesome-VQA
A reading list of papers about Visual Question Answering.
☆35 Updated 3 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆56 Updated 2 years ago
google-deepmind / svo_probes
The SVO-Probes Dataset for Verb Understanding
☆31 Updated 3 years ago
mad-red / VSR-guided-CIC
Human-like Controllable Image Captioning with Verb-specific Semantic Roles.
☆36 Updated 3 years ago
layer6ai-labs / SGG-Seq2Seq
Code for the ICCV'21 paper "Context-aware Scene Graph Generation with Seq2Seq Transformers"
☆43 Updated 3 years ago
cdancette / vqa-cp-leaderboard
A collection of papers about VQA-CP datasets and their results
☆41 Updated 3 years ago
cdancette / detect-shortcuts
Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
☆27 Updated last year
BigRedT / info-ground
Learning phrase grounding from captioned images through InfoNCE bound on mutual information
☆74 Updated 5 years ago
easonnie / mlp-vil
MLPs for Vision and Language Modeling (Coming Soon)
☆27 Updated 3 years ago
alasdairtran / transform-and-tell
[CVPR 2020] Transform and Tell: Entity-Aware News Image Captioning
☆92 Updated last year
LuoweiZhou / coco-caption
kdexd/coco-caption@de6f385
☆26 Updated 5 years ago
bladewaltz1 / ModeCap
Controllable image captioning model with unsupervised modes
☆21 Updated 2 years ago
maximek3 / e-ViL
☆40 Updated 3 years ago
zmykevin / UC2
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 Updated 4 years ago
luomancs / retriever_reader_for_okvqa
☆18 Updated 2 years ago