makarandtapaswi / MovieQA_benchmark

Benchmark data and code for Question-Answering on Movie stories

☆46

Alternatives and similar repositories for MovieQA_benchmark

Users who are interested in MovieQA_benchmark are comparing it to the repositories listed below.

GuessWhatGame / guesswhat
GuessWhat?! Baselines
☆74 Updated 3 years ago
jayleicn / TVQA
[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering
☆181 Updated 3 years ago
lichengunc / pretrain-vl-data
Pre-trained V+L Data Preparation
☆46 Updated 5 years ago
satwikkottur / clevr-dialog
Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog
☆49 Updated 5 years ago
idansc / simple-avsd
Code for "A Simple Baseline for Audio-Visual Scene-Aware Dialog"
☆26 Updated 5 years ago
hudaAlamri / DSTC7-Audio-Visual-Scene-Aware-Dialog-AVSD-Challenge
☆53 Updated 6 years ago
ronghanghu / snmn
Code release for Hu et al., Explainable Neural Computation via Stack Neural Module Networks, ECCV 2018
☆71 Updated 6 years ago
raingo / TGIF-Release
Animated GIF Description Dataset
☆116 Updated last year
cvlab-columbia / expert
Code for Learning to Learn Language from Narrated Video
☆33 Updated 2 years ago
airsplay / VisualRelationships
Data of ACL 2019 Paper "Expressing Visual Relationships via Language".
☆62 Updated 5 years ago
rosinality / mac-network-pytorch
Memory, Attention and Composition (MAC) Network for CLEVR implemented in PyTorch
☆85 Updated 6 years ago
ExplorerFreda / VGNSL
[ACL 2019] Visually Grounded Neural Syntax Acquisition
☆90 Updated last year
aylai / DenotationGraph
Generate a denotation graph from a set of image captions
☆15 Updated 7 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆162 Updated 5 years ago
cvlab-columbia / globetrotter
Code for the Globetrotter project
☆23 Updated 3 years ago
yiyang92 / vae_captioning
Implementation of Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
☆59 Updated 7 years ago
yikang-li / iQAN
Visual Question Generation as Dual Task of Visual Question Answering (PyTorch Version)
☆82 Updated 7 years ago
antoine77340 / Mixture-of-Embedding-Experts
Mixture-of-Embeddings-Experts
☆120 Updated 5 years ago
LuoweiZhou / YouCook2-Leaderboard
A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.
☆40 Updated 3 years ago
DmZhukov / CrossTask
☆93 Updated 3 years ago
ruotianluo / GoogleConceptualCaptioning
☆54 Updated 5 years ago
facebookresearch / corefnmn
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
☆57 Updated 4 years ago
yuleiniu / rva
Code for CVPR'19 "Recursive Visual Attention in Visual Dialog"
☆64 Updated 2 years ago
ronghanghu / cmn
Code release for Hu et al., Modeling Relationships in Referential Expressions with Compositional Modular Networks, CVPR 2017
☆67 Updated 7 years ago
hassanhub / MultiGrounding
This is the repo for Multi-level textual grounding
☆34 Updated 5 years ago
peteanderson80 / coco-caption
Adds the SPICE metric to the coco-caption evaluation server code
☆50 Updated 2 years ago
ruotianluo / refexp-comprehension
Referring expression comprehension on ReferIt(RefClef)
☆10 Updated 8 years ago
ronghanghu / gqa_single_hop_baseline
A simple but well-performing "single-hop" visual attention model for the GQA dataset
☆20 Updated 6 years ago
lichengunc / speaker_listener_reinforcer
Torch Implementation of Speaker-Listener-Reinforcer for Referring Expression Generation and Comprehension
☆34 Updated 7 years ago
peteanderson80 / SPICE
Semantic Propositional Image Caption Evaluation
☆144 Updated 2 years ago