e-bug/volta

[TACL 2021] Code and data for the framework in "Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs"

☆115

Alternatives and similar repositories for volta

Users who are interested in volta are comparing it to the libraries listed below.

e-bug / cross-modal-ablation
[EMNLP 2021] Code and data for our paper "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers…
☆20 · Jan 17, 2022 · Updated 4 years ago
e-bug / iglue
[ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"
☆49 · Dec 7, 2022 · Updated 3 years ago
marvl-challenge / marvl-code
[EMNLP 2021] Code and data for our paper "Visually Grounded Reasoning across Languages and Cultures"
☆30 · Dec 30, 2021 · Updated 4 years ago
ChenRocks / UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
☆800 · Jun 30, 2021 · Updated 5 years ago
zhegan27 / LXMERT-AdvTrain
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": LXMERT…
☆21 · Oct 20, 2020 · Updated 5 years ago
facebookresearch / vilbert-multi-task
Multi Task Vision and Language
☆824 · Feb 16, 2022 · Updated 4 years ago
zzxslp / XL-VLN
Dataset for Bilingual VLN
☆11 · Dec 5, 2020 · Updated 5 years ago
zhegan27 / VILLA
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER…
☆119 · Jan 13, 2021 · Updated 5 years ago
zmykevin / UC2
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 · Nov 9, 2021 · Updated 4 years ago
facebookresearch / grid-feats-vqa
Grid features pre-training code for visual question answering
☆269 · Sep 17, 2021 · Updated 4 years ago
malihealikhani / Cross-modal_Coherence_Modeling
Cross-modal Coherence Modeling for Caption Generation
☆11 · Jul 24, 2020 · Updated 5 years ago
yuewang-cuhk / awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
☆1,159 · Aug 19, 2022 · Updated 3 years ago
researchmm / soho
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning
☆208 · Sep 30, 2022 · Updated 3 years ago
McGill-NLP / imagecode
Code and data for ImageCoDe, a contextual vision-and-language benchmark
☆42 · Mar 1, 2024 · Updated 2 years ago
jackroos / VL-BERT
Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations".
☆744 · May 22, 2023 · Updated 3 years ago
uclanlp / visualbert
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
☆543 · May 1, 2023 · Updated 3 years ago
microsoft / Oscar
Oscar and VinVL
☆1,054 · Aug 28, 2023 · Updated 2 years ago
hwanheelee1993 / ViLBERTScore
Code for ViLBERTScore in EMNLP Eval4NLP
☆18 · Oct 27, 2022 · Updated 3 years ago
delchiaro / RATT
☆18 · Oct 3, 2023 · Updated 2 years ago
cambridgeltl / ECNMT
Emergent Communication Pretraining for Few-Shot Machine Translation
☆13 · Dec 3, 2020 · Updated 5 years ago
maryamziaa / ConceptBERT
☆10 · Jul 23, 2021 · Updated 4 years ago
ck0123 / improved-bertscore-for-image-captioning-evaluation
☆21 · Jul 25, 2024 · Updated last year
hardyqr / Visual-Semantic-Embeddings-an-incomplete-list
A paper list of visual semantic embeddings and text-image retrieval.
☆41 · Dec 4, 2020 · Updated 5 years ago
airsplay / lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
☆967 · Oct 22, 2022 · Updated 3 years ago
gsig / visual-grounding
Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020
☆43 · Apr 26, 2020 · Updated 6 years ago
allenai / x-lxmert
PyTorch code for EMNLP 2020 paper "X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers"
☆50 · Aug 27, 2021 · Updated 4 years ago
facebookresearch / connect-caption-and-trace
A unified framework to jointly model images, text, and human attention traces.
☆80 · May 24, 2021 · Updated 5 years ago
airsplay / vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
☆191 · Mar 8, 2021 · Updated 5 years ago
airsplay / py-bottom-up-attention
PyTorch bottom-up attention with Detectron2
☆239 · Jan 4, 2022 · Updated 4 years ago
lichengunc / pretrain-vl-data
Pre-trained V+L Data Preparation
☆47 · Jun 2, 2020 · Updated 6 years ago
Zhiquan-Wen / D-VQA
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)
☆26 · Oct 13, 2022 · Updated 3 years ago
LuoweiZhou / VLP
Vision-Language Pre-training for Image Captioning and Question Answering
☆421 · Jan 18, 2022 · Updated 4 years ago
ExplorerFreda / VGNSL
[ACL 2019] Visually Grounded Neural Syntax Acquisition
☆90 · Feb 24, 2024 · Updated 2 years ago
Heidelberg-NLP / VALSE
Data repository for the VALSE benchmark.
☆40 · Feb 15, 2024 · Updated 2 years ago
lil-lab / nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is an…
☆270 · Aug 18, 2022 · Updated 3 years ago
chihyaoma / cyclical-visual-captioning
PyTorch code for: Learning to Generate Grounded Visual Captions without Localization Supervision
☆46 · Jul 29, 2020 · Updated 5 years ago
kywen1119 / DSRAN
Code for journal paper "Learning Dual Semantic Relations with Graph Attention for Image-Text Matching", TCSVT, 2020.
☆74 · Oct 25, 2022 · Updated 3 years ago
WebQnA / WebQA
☆68 · Jan 3, 2025 · Updated last year
mitjanikolaus / compositional-image-captioning
Code for the CoNLL 2019 paper "Compositional Generalization in Image Captioning" by Mitja Nikolaus, Mostafa Abdou, Matthew Lamm, Rahul Ar…
☆26 · Jun 14, 2020 · Updated 6 years ago