malmaud / whats_cookin
Dataset generated by the methods in "What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision"
☆20 Updated 9 years ago
Alternatives and similar repositories for whats_cookin:
Users who are interested in whats_cookin are comparing it to the libraries listed below.
- Code for reproducing the results in "Mining Semantic Affordances of Visual Object Categories" ☆10 Updated 7 months ago
- ☆11 Updated 7 years ago
- Visual Storytelling API ☆35 Updated 7 years ago
- annotated screenplays for 39 CSI:Crime Scene Investigation episodes for paper "Whodunnit? Crime Drama as a Case for Natural Language Unde… ☆46 Updated 4 years ago
- Generate a denotation graph from a set of image captions ☆15 Updated 6 years ago
- Benchmark data and code for Question-Answering on Movie stories ☆42 Updated 4 years ago
- Implement Natural Language Object Retrieval in TensorFlow ☆35 Updated 8 years ago
- Code for Unsupervised Discovery of Multimodal Links in Multi-Image/Multi-Sentence Documents ☆30 Updated 4 years ago
- Learning visually grounded word embeddings using Abstract scenes ☆18 Updated 5 years ago
- Visual Verb Sense Disambiguation ☆13 Updated 5 years ago
- Website for TextVQA dataset. ☆28 Updated last year
- GuessWhat?! Baselines ☆72 Updated 2 years ago
- Representations of language in a model of visually grounded speech signal. ☆23 Updated 6 years ago
- VIST storytelling evaluation tool ☆21 Updated last year
- ☆24 Updated 8 years ago
- Referring expression comprehension on ReferIt(RefClef) ☆9 Updated 8 years ago
- Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog ☆45 Updated 4 years ago
- Transfer Learning via Unsupervised Task Discovery for Visual Question Answering ☆19 Updated 5 years ago
- Referring Expression Generation using Neural Networks ☆22 Updated 2 years ago
- Variational autoencoder in Theano ☆12 Updated 7 years ago
- Code for the paper "Unsupervised Learning from Narrated Instruction Videos", CVPR2016 ☆19 Updated 8 years ago
- Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering ☆24 Updated 4 years ago
- Models and Codes for the paper Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions ☆14 Updated 6 years ago
- Transfer Learning via Unsupervised Task Discovery for Visual Question Answering ☆31 Updated 5 years ago
- Using embedding-based loss functions for phonetics/speech recognition. ☆17 Updated 10 years ago
- Localize objects in images using referring expressions ☆36 Updated 8 years ago
- The good practice in the VQA system such as pos-tag attention, structured triplet learning and triplet attention is very general and can be… ☆19 Updated 7 years ago
- An unofficial PyTorch implementation of the HAN and AdaHAN models presented in the "Learning Visual Question Answering by Bootstrapping H… ☆54 Updated 6 years ago
- Analogs of Linguistic Structure in Deep Representations ☆19 Updated 7 years ago
- Project Uncovering Temporal Context for Video Question and Answering ☆14 Updated 8 years ago