MILVLG/mt-captioning

A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning

☆25

Alternatives and similar repositories for mt-captioning

Users who are interested in mt-captioning are comparing it to the repositories listed below.

MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆57 · Jun 13, 2023 · Updated 3 years ago
fkxssaa / Deliberate-Attention-Networks-for-Image-Captioning
Deliberate Attention Networks for Image Captioning (AAAI 2019)
☆11 · Sep 30, 2019 · Updated 6 years ago
MILVLG / mmnas
Deep Multimodal Neural Architecture Search
☆29 · Nov 15, 2020 · Updated 5 years ago
WuJie1010 / Fine-Grained-Image-Captioning
The PyTorch implementation of “Fine-Grained Image Captioning with Global-Local Discriminative Objective”
☆21 · Oct 17, 2019 · Updated 6 years ago
ayouboumani / image-captioning-with-attention
A PyTorch implementation of the paper 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering'
☆10 · Jan 20, 2020 · Updated 6 years ago
MILVLG / bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models
☆301 · Apr 7, 2022 · Updated 4 years ago
ShemoonX / Chinese-image-caption
Image Chinese Description Generation Based on Multi-level Selective Visual Semantic Attributes
☆16 · Nov 2, 2021 · Updated 4 years ago
yahoo / object_relation_transformer
Implementation of the Object Relation Transformer for Image Captioning
☆180 · Sep 17, 2024 · Updated last year
cswhjiang / Recurrent_Fusion_Network
Source code for "Recurrent Fusion Network for Image Captioning".
☆23 · Nov 24, 2018 · Updated 7 years ago
shengyuzhang / Poet
Poet: Product-oriented Video Captioner for E-commerce
☆12 · Sep 21, 2020 · Updated 5 years ago
luo3300612 / Transformer-Captioning
Optimized code based on M2 for faster image captioning training
☆21 · Nov 18, 2022 · Updated 3 years ago
mangalutsav / Multi-Stage-LSTM-for-Action-Anticipation
Implementation of "Encouraging LSTMs to Anticipate Actions Very Early", ICCV 2017
☆19 · Mar 25, 2018 · Updated 8 years ago
showkeyjar / chinese_im2text.pytorch
PyTorch implementation of Chinese image captioning on AI_challenger dataset
☆34 · Dec 25, 2019 · Updated 6 years ago
tgGuo15 / PriorImageCaption
☆30 · Oct 2, 2018 · Updated 7 years ago
eric-xw / Video-guided-Machine-Translation
Starter code for the VMT task and challenge
☆51 · Jul 29, 2020 · Updated 5 years ago
JonghwanMun / TextguidedATT
The implementation of Text-guided Attention Model for Image Captioning
☆21 · Nov 9, 2017 · Updated 8 years ago
lstappen / MuSe2020
Accompanying code to reproduce the baselines of the International Multimodal Sentiment Analysis Challenge (MuSe 2020).
☆16 · Dec 8, 2022 · Updated 3 years ago
ARiSE-Lab / CYCLE_OOPSLA_24
Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation"
☆10 · Mar 8, 2024 · Updated 2 years ago
webYFDT / hateful
☆11 · May 18, 2022 · Updated 4 years ago
husthuaan / AAT
Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019
☆50 · Dec 18, 2019 · Updated 6 years ago
chenghuige / chinese_im2text.pytorch
PyTorch implementation of Chinese image captioning on AI_challenger dataset
☆13 · Sep 24, 2017 · Updated 8 years ago
jayleicn / TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
☆91 · Sep 6, 2023 · Updated 2 years ago
lixiangpengcs / Spatial-Temporal-Adaptive-Attention-for-Video-Captioning
Extension of hLSTMat
☆19 · Apr 15, 2021 · Updated 5 years ago
MILVLG / activitynet-qa
A VideoQA dataset based on the videos from ActivityNet
☆94 · Nov 22, 2020 · Updated 5 years ago
ruotianluo / bottom-up-attention-ai-challenger
☆37 · Jan 5, 2018 · Updated 8 years ago
husthuaan / AoANet
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
☆339 · May 2, 2021 · Updated 5 years ago
DataScienceNigeria / Efficient-Video-Generation-on-Complex-Datasets
☆15 · Jul 23, 2019 · Updated 6 years ago
rxy007 / cnn-lstm-crf
CNN-BiLSTM-CRF for Chinese named entity recognition
☆13 · Sep 25, 2020 · Updated 5 years ago
GauravGajbhiye / SCAMET_RSIC
A TensorFlow 2.2-based SCAMET framework for remote sensing image captioning.
☆13 · Aug 10, 2023 · Updated 2 years ago
sverma88 / DeepCU-IJCAI19
DeepCU: Integrating Both Common and Unique Latent Information for Multimodal Sentiment Analysis, IJCAI-19
☆19 · Nov 21, 2019 · Updated 6 years ago
ezeli / Transformer_model
A PyTorch implementation of Attention Is All You Need (Transformer) for image captioning.
☆12 · Nov 15, 2021 · Updated 4 years ago
MILVLG / openvqa
A lightweight, scalable, and general framework for visual question answering research
☆333 · Sep 3, 2021 · Updated 4 years ago
cshizhe / asg2cap
Code accompanying the paper "Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs" (Chen et al., …
☆200 · Dec 1, 2022 · Updated 3 years ago
facebookresearch / grid-feats-vqa
Grid features pre-training code for visual question answering
☆269 · Sep 17, 2021 · Updated 4 years ago
yikang-li / vg_cleansing
Dataset cleansing for Visual Genome
☆30 · Apr 26, 2017 · Updated 9 years ago
qingzwang / DiversityMetrics
An implementation of the self-CIDEr and LSA-based diversity metrics (Python 2.7 only).
☆37 · Feb 26, 2022 · Updated 4 years ago
SeleenaJM / CapEval
An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019)
☆37 · May 3, 2020 · Updated 6 years ago
Wentong-DST / up-down-captioner
Caffe implementation of the paper "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering"
☆29 · Oct 24, 2018 · Updated 7 years ago
soloist97 / densecap-pytorch
A simplified PyTorch version of densecap
☆43 · Dec 11, 2024 · Updated last year