torralba-lab / im2recipe-Pytorch
im2recipe Pytorch implementation
☆267 Updated 6 months ago
Related projects:
- Code supporting the CVPR 2017 paper "Learning Cross-modal Embeddings for Cooking Recipes and Food Images" ☆370 Updated 7 years ago
- Retrieve recipes from foodie pictures using Deep Learning and Pytorch ☆52 Updated 3 years ago
- Conceptual Captions is a dataset containing (image-URL, caption) pairs designed for the training and evaluation of machine learned image … ☆513 Updated 3 years ago
- Vision-Language Pre-training for Image Captioning and Question Answering ☆409 Updated 2 years ago
- Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019) ☆464 Updated 3 years ago
- ☆470 Updated last year
- Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images ☆55 Updated 5 years ago
- Recipe Generation from Food Images ☆617 Updated 4 years ago
- A Dataset for Grounded Video Description ☆158 Updated 2 years ago
- deep learning, image retrieval, vision and language ☆296 Updated 3 years ago
- ☆142 Updated 2 years ago
- Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language" ☆526 Updated last year
- ☆187 Updated 2 years ago
- Automatic image captioning model based on Caffe, using features from bottom-up attention. ☆243 Updated last year
- A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning ☆579 Updated 3 years ago
- Code for Unsupervised Image Captioning ☆215 Updated last year
- Oscar and VinVL ☆1,037 Updated last year
- Starter code in PyTorch for the Visual Dialog challenge ☆192 Updated last year
- Transformer-based image captioning extension for pytorch/fairseq ☆313 Updated 3 years ago
- 🖼️ Attend to You: Personalized Image Captioning with Context Sequence Memory Networks. In CVPR, 2017. Expanded : Towards Personalized Im… ☆209 Updated 3 years ago
- PyTorch code for our CVPR 2018 paper "Neural Baby Talk" ☆523 Updated 5 years ago
- [CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations ☆556 Updated 8 months ago
- PyTorch bottom-up attention with Detectron2 ☆229 Updated 2 years ago
- [EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering ☆169 Updated last year
- PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" ☆488 Updated 2 years ago
- Implementation for the paper "Compositional Attention Networks for Machine Reasoning" (Hudson and Manning, ICLR 2018) ☆492 Updated 3 years ago
- Code for ICLR 2020 paper "VL-BERT: Pre-training of Generic Visual-Linguistic Representations". ☆735 Updated last year
- The iMaterialist Fashion Attribute Dataset ☆82 Updated 3 years ago
- Supervised Multimodal Bitransformers for Classifying Images and Text ☆243 Updated 3 years ago
- Train embodied agents that can answer questions in environments ☆294 Updated last year