jungokasai / THumB

☆15

Alternatives and similar repositories for THumB

Users who are interested in THumB are comparing it to the repositories listed below.

WadeYin9712 / GD-VCR
Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).
☆29 Updated 4 years ago
j-min / VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
☆374 Updated 2 years ago
Yebin46 / FLEUR
[ACL 2024] FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
☆17 Updated 8 months ago
e-bug / volta
[TACL 2021] Code and data for the framework in "Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-La…
☆114 Updated 3 years ago
e-bug / iglue
[ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"
☆49 Updated 3 years ago
facebookresearch / simmc2
Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations
☆106 Updated 3 years ago
allenai / multimodalqa
☆147 Updated 3 years ago
allenai / sherlock
Code, data, models for the Sherlock corpus
☆59 Updated 3 years ago
nttmdlab-nlp / VisualMRC
VisualMRC: Machine Reading Comprehension on Document Images (AAAI2021)
☆56 Updated 9 months ago
hwanheelee1993 / UMIC
An unreferenced image captioning metric (ACL-21)
☆30 Updated last year
salesforce / VD-BERT
☆44 Updated 6 months ago
google-research-datasets / Crisscrossed-Captions
Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
☆54 Updated 5 years ago
xxxiaol / spatial-commonsense
Source code and data for Things not Written in Text: Exploring Spatial Commonsense from Visual Signals (ACL2022 main conference paper).
☆20 Updated 3 years ago
zmykevin / UC2
CVPR 2021 Official PyTorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 Updated 4 years ago
ExplainableML / CLEVR-X
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
☆29 Updated 2 years ago
YujieLu10 / IACE-NLU
Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.
☆17 Updated 3 years ago
libeineu / fairseq_mmt
This code repository is for the accepted ACL2022 paper "On Vision Features in Multimodal Machine Translation". We provide the details and…
☆43 Updated 3 years ago
necla-ml / SNLI-VE
Dataset and starting code for visual entailment dataset
☆118 Updated 3 years ago
VegB / iNLG
Implementation of "Visualize Before You Write: Imagination-Guided Open-Ended Text Generation".
☆17 Updated 2 years ago
zinengtang / VidLanKD
PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)
☆56 Updated 2 years ago
SALT-NLP / Adaptive-Compositional-Modules
Code for the ACL 2022 paper "Continual Sequence Generation with Adaptive Compositional Modules"
☆39 Updated 3 years ago
woojeongjin / FewVLM
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models (ACL 2022)
☆43 Updated 3 years ago
idansc / mrr-ndcg
☆18 Updated last year
HAWLYQ / InfoMetIC
☆13 Updated 2 years ago
LividWo / Revisit-MMT
☆25 Updated 4 years ago
airsplay / vokenization
PyTorch code for EMNLP 2020 Paper "Vokenization: Improving Language Understanding with Visual Supervision"
☆192 Updated 4 years ago
jamespark3922 / visual-comet
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
☆88 Updated 2 years ago
maximek3 / e-ViL
☆40 Updated 3 years ago
shubhamagarwal92 / visdial_conv
This repository contains code used in our ACL'20 paper History for Visual Dialog: Do we really need it?
☆34 Updated 2 years ago
ChenyuHeidiZhang / VL-commonsense
☆15 Updated 3 years ago