yikuan8/Transformers-VQA

An implementation that fine-tunes pre-trained vision-and-language (V+L) models on downstream VQA tasks. Currently supports VisualBERT, LXMERT, and UNITER.

☆165

Alternatives and similar repositories for Transformers-VQA

Users that are interested in Transformers-VQA are comparing it to the libraries listed below.

airsplay / lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
☆965 · Oct 22, 2022 · Updated 3 years ago
airsplay / py-bottom-up-attention
PyTorch bottom-up attention with Detectron2
☆239 · Jan 4, 2022 · Updated 4 years ago
uclanlp / visualbert
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
☆542 · May 1, 2023 · Updated 3 years ago
gchhablani / multilingual-vqa
Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week.
☆33 · Jul 27, 2021 · Updated 5 years ago
ChenRocks / UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
☆799 · Jun 30, 2021 · Updated 5 years ago
MayankSingal / VQA-Transformer
Visual Question Answering through transformers.
☆13 · Sep 21, 2018 · Updated 7 years ago
VirajBagal / MMBERT
MMBERT: Multimodal BERT Pretraining for Improved Medical VQA
☆39 · Mar 22, 2021 · Updated 5 years ago
tejas-gokhale / vqa_mutant
☆13 · Feb 14, 2022 · Updated 4 years ago
CCYChongyanChen / VQA_AlgorithmDatasets
☆37 · Jan 20, 2023 · Updated 3 years ago
ThalesGroup / ConceptBERT
Implementation of ConceptBert: Concept-Aware Representation for Visual Question Answering
☆31 · Apr 30, 2024 · Updated 2 years ago
aioz-ai / MICCAI21_MMQ
Multiple Meta-model Quantifying for Medical Visual Question Answering (MICCAI 2021)
☆37 · Apr 21, 2026 · Updated 3 months ago
facebookresearch / grid-feats-vqa
Grid features pre-training code for visual question answering
☆269 · Sep 17, 2021 · Updated 4 years ago
vuhoangminh / vqa_medical
☆10 · Oct 20, 2022 · Updated 3 years ago
Awenbocc / med-vqa
Medical Visual Question Answering via Conditional Reasoning [ACM MM 2020]
☆64 · Aug 20, 2021 · Updated 4 years ago
abachaa / VQA-Med-2019
Visual Question Answering in the Medical Domain VQA-Med 2019
☆95 · May 13, 2026 · Updated 2 months ago
yanxinzju / CSS-VQA
Counterfactual Samples Synthesizing for Robust VQA
☆78 · Nov 24, 2022 · Updated 3 years ago
jackroos / VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
☆742 · May 22, 2023 · Updated 3 years ago
facebookresearch / vilbert-multi-task
Multi Task Vision and Language
☆824 · Feb 16, 2022 · Updated 4 years ago
microsoft / Oscar
Oscar and VinVL
☆1,054 · Aug 28, 2023 · Updated 2 years ago
zhegan27 / LXMERT-AdvTrain
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…
☆21 · Oct 20, 2020 · Updated 5 years ago
HimariO / HatefulMemesChallenge
☆93 · Dec 14, 2022 · Updated 3 years ago
cdancette / detect-shortcuts
Repo for ICCV 2021 paper: Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering
☆29 · Jul 1, 2024 · Updated 2 years ago
researchmm / soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
☆208 · Sep 30, 2022 · Updated 3 years ago
SuperSupermoon / MedViLL
MedViLL official code. (Published IEEE JBHI 2021)
☆110 · Dec 26, 2024 · Updated last year
ricbl / eyetracking
This code was used to collect, process, and validate the REFLACX (Reports and Eye-Tracking Data for Localization of Abnormalities in Ches…
☆20 · Apr 6, 2022 · Updated 4 years ago
yuewang-cuhk / awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
☆1,159 · Aug 19, 2022 · Updated 3 years ago
zh-plus / Awesome-VLP-and-Efficient-Transformer
Vision-Language Pretraining & Efficient Transformer Papers.
☆15 · Nov 30, 2021 · Updated 4 years ago
KaihuaTang / VQA2.0-Recent-Approachs-2018.pytorch
A PyTorch reimplementation of "Bilinear Attention Network", "Intra- and Inter-modality Attention", "Learning Conditioned Graph Structures…
☆300 · Jan 6, 2026 · Updated 6 months ago
guoyang9 / UnifER
Official implementation for the MM'22 paper.
☆14 · Jun 30, 2022 · Updated 4 years ago
zhegan27 / VILLA
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER…
☆119 · Jan 13, 2021 · Updated 5 years ago
abachaa / VQA-Med-2020
VQA-Med 2020
☆16 · May 13, 2026 · Updated 2 months ago
Adam1679 / mutan-article-net
Implementation of Mutan+ArticleNet on OKVQA
☆10 · Jan 11, 2021 · Updated 5 years ago
hengyuan-hu / bottom-up-attention-vqa
An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
☆768 · Mar 10, 2024 · Updated 2 years ago
Cyanogenoid / pytorch-vqa
Strong baseline for visual question answering
☆240 · Mar 13, 2023 · Updated 3 years ago
LuoweiZhou / VLP
Vision-Language Pre-training for Image Captioning and Question Answering
☆420 · Jan 18, 2022 · Updated 4 years ago
jnhwkim / ban-vqa
Bilinear attention networks for visual question answering
☆548 · Oct 30, 2023 · Updated 2 years ago
peteanderson80 / bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
☆1,470 · Feb 3, 2023 · Updated 3 years ago
zizhaozhang / distill2
☆12 · Jun 21, 2022 · Updated 4 years ago
lichengunc / pretrain-vl-data
Pre-trained V+L Data Preparation
☆47 · Jun 2, 2020 · Updated 6 years ago