josharnoldjosh / Image-Caption-Joint-EmbeddingLinks

A multimodal embedding of images and captions, built with PyTorch, written with Python3.

☆30

Alternatives and similar repositories for Image-Caption-Joint-Embedding

Users that are interested in Image-Caption-Joint-Embedding are comparing it to the libraries listed below

Sorting:

lukemelas / image-paragraph-captioning
[EMNLP 2018] Training for Diversity in Image Paragraph Captioning
☆89Updated 5 years ago
Dong-JinKim / DenseRelationalCaptioning
Code of Dense Relational Captioning
☆69Updated 2 years ago
aioz-ai / ICCV19_VQA-CTI
Compact Trilinear Interaction for Visual Question Answering (ICCV 2019)
☆38Updated 2 years ago
tbmoon / basic_vqa
Pytorch VQA : Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf)
☆95Updated last year
poojahira / image-captioning-bottom-up-top-down
PyTorch implementation of Image captioning with Bottom-up, Top-down Attention
☆166Updated 6 years ago
husthuaan / AoANet
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
☆333Updated 4 years ago
danieljf24 / w2vv
Word2VisualVec : Predicting Visual Features from Text for Image and Video Caption Retrieval
☆69Updated 5 years ago
yahoo / object_relation_transformer
Implementation of the Object Relation Transformer for Image Captioning
☆178Updated 10 months ago
violetteshev / bottom-up-features
Bottom-up features extractor implemented in PyTorch.
☆72Updated 5 years ago
asdf0982 / vqa-mfb.pytorch
This project is out of date, I don't remember the details inside...
☆84Updated 7 years ago
Cadene / block.bootstrap.pytorch
BLOCK (AAAI 2019), with a multimodal fusion library for deep learning models
☆354Updated 5 years ago
pliang279 / factorized
[ICLR 2019] Learning Factorized Multimodal Representations
☆67Updated 5 years ago
hobincar / pytorch-video-feature-extractor
A repository for extract CNN features from videos using pytorch
☆70Updated 2 years ago
fawazsammani / show-edit-tell
Show, Edit and Tell: A Framework for Editing Image Captions, CVPR 2020
☆80Updated 5 years ago
parksunwoo / show_attend_and_tell_pytorch
Pytorch implement Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
☆95Updated 6 years ago
ajamjoom / Image-Captions
BERT + Image Captioning
☆132Updated 4 years ago
aimbrain / vqa-project
Code for our paper: Learning Conditioned Graph Structures for Interpretable Visual Question Answering
☆149Updated 6 years ago
JDAI-CV / image-captioning
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
☆274Updated 4 years ago
johnarevalo / gmu-mmimdb
Source code for training Gated Multimodal Units on MM-IMDb dataset
☆95Updated 2 years ago
yalesong / pvse
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval (CVPR 2019)
☆134Updated last year
husthuaan / AAT
Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019
☆50Updated 5 years ago
Cyanogenoid / pytorch-vqa
Strong baseline for visual question answering
☆240Updated 2 years ago
MILVLG / mt-captioning
A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning
☆25Updated 4 years ago
fenglinliu98 / MIA
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" （NeurIPS 2019）
☆65Updated 4 years ago
Gitsamshi / WeakVRD-Captioning
Implementation of paper "Improving Image Captioning with Better Use of Caption"
☆33Updated 4 years ago
Shivanshu-Gupta / Visual-Question-Answering
CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering
☆75Updated 5 years ago
andyweizhao / Multitask_Image_Captioning
☆22Updated 6 years ago
hwang1996 / ACME
Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images
☆58Updated 6 years ago
Wangt-CN / MTFN-RR-PyTorch-Code
The offical code for paper "Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking", ACM Multimedia 2019 Oral
☆68Updated 5 years ago
yikuan8 / Transformers-VQA
An implementation that downstreams pre-trained V+L models to VQA tasks. Now support: VisualBERT, LXMERT, and UNITER
☆164Updated 2 years ago