allenai/x-lxmert

PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"

☆50

Alternatives and similar repositories for x-lxmert

Users who are interested in x-lxmert are comparing it to the repositories listed below.

zhegan27 / LXMERT-AdvTrain
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…
☆21 · Oct 20, 2020 · Updated 5 years ago
UCSB-AI / CPL
Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"
☆35 · Dec 5, 2022 · Updated 3 years ago
prdwb / okvqa-release
☆15 · May 10, 2021 · Updated 5 years ago
ronghanghu / lcgn
Code release for Hu et al., "Language-Conditioned Graph Networks for Relational Reasoning", ICCV 2019
☆92 · Aug 9, 2019 · Updated 6 years ago
zinengtang / VidLanKD
PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)
☆56 · Feb 6, 2023 · Updated 3 years ago
researchmm / generate-it
A collection of models for image<->text generation in ACM MM 2021.
☆67 · Oct 31, 2021 · Updated 4 years ago
jiasenlu / vilbert_beta
☆478 · Nov 21, 2022 · Updated 3 years ago
zzxslp / XL-VLN
Dataset for Bilingual VLN
☆11 · Dec 5, 2020 · Updated 5 years ago
j-min / VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
☆372 · Jul 29, 2023 · Updated 2 years ago
bearcatt / LaBERT
A length-controllable and non-autoregressive image captioning model.
☆69 · Jun 10, 2021 · Updated 5 years ago
ck0123 / improved-bertscore-for-image-captioning-evaluation
☆21 · Jul 25, 2024 · Updated 2 years ago
uclanlp / visualbert
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
☆542 · May 1, 2023 · Updated 3 years ago
microsoft / Oscar
Oscar and VinVL
☆1,054 · Aug 28, 2023 · Updated 2 years ago
VegB / VLN-Transformer
Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"
☆27 · Mar 4, 2021 · Updated 5 years ago
yuweihao / KERN
Code for Knowledge-Embedded Routing Network for Scene Graph Generation (CVPR 2019)
☆121 · Aug 17, 2022 · Updated 3 years ago
JaywongWang / CBP
Official TensorFlow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P…
☆59 · Mar 24, 2023 · Updated 3 years ago
chihyaoma / cyclical-visual-captioning
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
☆46 · Jul 29, 2020 · Updated 5 years ago
airsplay / lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
☆965 · Oct 22, 2022 · Updated 3 years ago
ronghanghu / snmn
Code release for Hu et al., "Explainable Neural Computation via Stack Neural Module Networks", ECCV 2018
☆71 · Nov 17, 2019 · Updated 6 years ago
ARIES-LM / GMNMT
☆30 · Nov 3, 2020 · Updated 5 years ago
EricWWWW / image-caption-metrics
A Python 3 library for NLP and image-caption metrics: BLEU, METEOR, CIDEr, ROUGE, SPICE, WMD
☆14 · Sep 13, 2022 · Updated 3 years ago
ExplorerFreda / VGNSL
[ACL 2019] Visually Grounded Neural Syntax Acquisition
☆90 · Feb 24, 2024 · Updated 2 years ago
e-bug / volta
[TACL 2021] Code and data for the framework in "Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-La…
☆115 · Mar 24, 2022 · Updated 4 years ago
j-min / DallEval
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
☆143 · Jun 10, 2025 · Updated last year
wangzheallen / STL-VQA
Good practices in VQA systems, such as pos-tag attention, structured triplet learning and triplet attention, are very general and can be…
☆19 · Jan 23, 2018 · Updated 8 years ago
he-dhamo / simsg
Semantic Image Manipulation using Scene Graphs (CVPR 2020)
☆60 · May 1, 2023 · Updated 3 years ago
facebookresearch / vilbert-multi-task
Multi Task Vision and Language
☆824 · Feb 16, 2022 · Updated 4 years ago
sibeiyang / sgmn
Graph-Structured Referring Expressions Reasoning in The Wild (CVPR 2020, Oral)
☆117 · Aug 10, 2020 · Updated 5 years ago
zaynmi / seada-vqa
A PyTorch implementation of a data augmentation method for visual question answering
☆21 · May 25, 2023 · Updated 3 years ago
INK-USC / VisCOLL
Code and data for the project "Visually grounded continual learning of compositional semantics"
☆22 · Dec 27, 2022 · Updated 3 years ago
husthuaan / AoANet
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
☆339 · May 2, 2021 · Updated 5 years ago
researchmm / soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
☆208 · Sep 30, 2022 · Updated 3 years ago
SpencerWhitehead / novelvqa
☆27 · Oct 7, 2021 · Updated 4 years ago
lil-lab / vgnsl_analysis
"What is Learned in Visually Grounded Neural Syntax Acquisition", Noriyuki Kojima, Hadar Averbuch-Elor, Alexander Rush and Yoav Artzi (AC…
☆12 · Dec 30, 2021 · Updated 4 years ago
CompVis / imagebart
ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis
☆126 · Mar 14, 2022 · Updated 4 years ago
krismuniz / google-kgsearch
A simple wrapper for Google's Knowledge Graph Search API.
☆14 · Apr 19, 2017 · Updated 9 years ago
khanhptnk / hanna
Visual Navigation with Natural Multimodal Assistance (EMNLP 2019)
☆29 · Jun 30, 2020 · Updated 6 years ago
kdexd / virtex
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
☆561 · Aug 22, 2025 · Updated 11 months ago
linjieli222 / HERO
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
☆235 · Sep 16, 2021 · Updated 4 years ago