malmaud / whats_cookin
Dataset generated by the methods in "What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision"
☆20Updated 9 years ago
Related projects:
- ☆11Updated 7 years ago
- Visual Storytelling API☆33Updated 7 years ago
- Code for reproducing the results in "Mining Semantic Affordances of Visual Object Categories"☆10Updated 3 months ago
- ☆24Updated 7 years ago
- Website for TextVQA dataset.☆28Updated last year
- ☆11Updated 7 years ago
- Code for Unsupervised Discovery of Multimodal Links in Multi-Image/Multi-Sentence Documents☆30Updated 4 years ago
- Models and code for the paper "Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions"☆15Updated 6 years ago
- Scripts to generate the CoDraw and i-CLEVR datasets used for the GeNeVA task proposed in our ICCV 2019 paper "Tell, Draw, and Repeat: Gen…☆37Updated last year
- Good practices in VQA systems, such as POS-tag attention, structured triplet learning and triplet attention, are very general and can be…☆20Updated 6 years ago
- Generate a denotation graph from a set of image captions☆15Updated 6 years ago
- Multi-Target Embodied Question Answering☆25Updated 4 years ago
- ☆12Updated this week
- Code to replicate "Generating Visual Explanations"☆47Updated 3 years ago
- Code for the paper "Unsupervised Learning from Narrated Instruction Videos", CVPR 2016☆19Updated 8 years ago
- Implementation of Natural Language Object Retrieval in TensorFlow☆36Updated 7 years ago
- Concreteness☆19Updated last year
- Annotated screenplays for 39 CSI: Crime Scene Investigation episodes for the paper "Whodunnit? Crime Drama as a Case for Natural Language Unde…☆46Updated 4 years ago
- Referring expression comprehension on ReferIt(RefClef)☆10Updated 7 years ago
- Weakly-supervised action segmentation in video☆16Updated 2 years ago
- Visual Verb Sense Disambiguation☆13Updated 5 years ago
- Random memory adaptation model inspired by the paper: "Memory-based parameter adaptation (MbPA)"☆24Updated 6 years ago
- Code for the COG dataset and network☆42Updated 5 years ago
- Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog☆44Updated 4 years ago
- Cornell House Agent Learning Environment☆46Updated 2 years ago
- Run PyTorch graphs inside a Theano graph (and a PyTorch wrapper for AIS for generative models).☆18Updated 6 years ago
- A list of recent papers on transfer learning☆24Updated 6 years ago
- Visual Navigation with Natural Multimodal Assistance (EMNLP 2019)☆27Updated 4 years ago
- An implementation of the NAACL'18 paper "Punny Captions: Witty Wordplay in Image Descriptions".☆33Updated 6 years ago
- Analogs of Linguistic Structure in Deep Representations☆19Updated 7 years ago