YueYANG1996 / wikiHow-VGSI

EMNLP 2021: Visual Goal-Step Inference using wikiHow

☆13

Alternatives and similar repositories for wikiHow-VGSI

Users who are interested in wikiHow-VGSI are comparing it to the repositories listed below.

xxxiaol / spatial-commonsense
Source code and data for Things not Written in Text: Exploring Spatial Commonsense from Visual Signals (ACL 2022 main conference paper).
☆20 Updated 2 years ago
ChenyuHeidiZhang / VL-commonsense
☆15 Updated 3 years ago
VegB / iNLG
Implementation of "Visualize Before You Write: Imagination-Guided Open-Ended Text Generation".
☆17 Updated 2 years ago
MichaelZhouwang / VLUE
This repo contains codes and instructions for baselines in the VLUE benchmark.
☆41 Updated 3 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 Updated 2 years ago
bobwan1995 / cliora
Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling
☆36 Updated 3 years ago
maximek3 / e-ViL
☆40 Updated 2 years ago
e-bug / cross-modal-ablation
[EMNLP 2021] Code and data for our paper "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers…
☆20 Updated 3 years ago
allenai / sherlock
Code, data, models for the Sherlock corpus
☆58 Updated 2 years ago
e-bug / volta
[TACL 2021] Code and data for the framework in "Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-La…
☆114 Updated 3 years ago
WadeYin9712 / GD-VCR
Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).
☆29 Updated 3 years ago
necla-ml / SNLI-VE
Dataset and starting code for visual entailment dataset
☆111 Updated 3 years ago
ajd12342 / why-winoground-hard
Code for 'Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality', EMNLP 2022
☆30 Updated 2 years ago
salesforce / VD-BERT
☆45 Updated 2 months ago
jamespark3922 / visual-comet
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
☆87 Updated 2 years ago
wenhuchen / Meta-Module-Network
Code for WACV 2021 Paper "Meta Module Network for Compositional Visual Reasoning"
☆43 Updated 4 years ago
google-research-datasets / Crisscrossed-Captions
Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
☆52 Updated 4 years ago
zinengtang / VidLanKD
Pytorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)
☆56 Updated 2 years ago
tingyaohsu / VIST-Edit
Visual Storytelling post-edit dataset
☆18 Updated 5 years ago
shubhamagarwal92 / visdial_conv
This repository contains code used in our ACL'20 paper History for Visual Dialog: Do we really need it?
☆34 Updated 2 years ago
Sy-Zhang / MMC-PCFG
Video-aided Unsupervised Grammar Induction, NAACL'21 [best long paper]
☆40 Updated 2 years ago
YujieLu10 / IACE-NLU
Official repo for "Imagination-Augmented Natural Language Understanding", NAACL 2022.
☆17 Updated 3 years ago
raeidsaqur / mgn
Multimodal Graph Network (MGN): Code repo, examples from the paper
☆25 Updated 4 years ago
woojeongjin / FewVLM
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models (ACL 2022)
☆42 Updated 3 years ago
limanling / KnowledgeVL-Reading
☆68 Updated 2 years ago
e-bug / iglue
[ICML 2022] Code and data for our paper "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"
☆49 Updated 2 years ago
MikeWangWZHL / VidIL
Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆115 Updated 2 years ago
zmykevin / UC2
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 Updated 3 years ago
HKUST-KnowComp / Visual_PCR
Dataset and Source code for EMNLP 2019 paper "What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues"
☆26 Updated 3 years ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆224 Updated 3 years ago