A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning
☆25Sep 4, 2020Updated 5 years ago
Alternatives and similar repositories for mt-captioning
Users who are interested in mt-captioning are comparing it to the libraries listed below
- ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration☆57Jun 13, 2023Updated 2 years ago
- Deliberate Attention Networks for Image Captioning (AAAI 2019)☆11Sep 30, 2019Updated 6 years ago
- Deep Multimodal Neural Architecture Search☆29Nov 15, 2020Updated 5 years ago
- A PyTorch implementation of “Fine-Grained Image Captioning with Global-Local Discriminative Objective”☆21Oct 17, 2019Updated 6 years ago
- A PyTorch implementation of the paper 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering'☆10Jan 20, 2020Updated 6 years ago
- A PyTorch reimplementation of bottom-up-attention models☆302Apr 7, 2022Updated 3 years ago
- Image Chinese Description Generation Based on Multi-level Selective Visual Semantic Attributes☆16Nov 2, 2021Updated 4 years ago
- Implementation of the Object Relation Transformer for Image Captioning☆180Sep 17, 2024Updated last year
- Official code for the paper "Self-Distillation for Few-Shot Image Captioning"☆16Mar 15, 2021Updated 5 years ago
- Source code for "Recurrent Fusion Network for Image Captioning".☆23Nov 24, 2018Updated 7 years ago
- Optimized code based on M2 for faster image captioning training☆21Nov 18, 2022Updated 3 years ago
- Poet: Product-oriented Video Captioner for E-commerce☆12Sep 21, 2020Updated 5 years ago
- Large-Scale Bidirectional Training for Zero-Shot Image Captioning☆21Feb 14, 2023Updated 3 years ago
- Implementation of "Encouraging LSTMs to Anticipate Actions Very Early", ICCV 2017☆19Mar 25, 2018Updated 7 years ago
- PyTorch implementation of Chinese image captioning on AI_challenger dataset☆34Dec 25, 2019Updated 6 years ago
- ☆30Oct 2, 2018Updated 7 years ago
- Starter code for the VMT task and challenge☆51Jul 29, 2020Updated 5 years ago
- CaMEL: Mean Teacher Learning for Image Captioning. ICPR 2022☆29Dec 1, 2022Updated 3 years ago
- The implementation of Text-guided Attention Model for Image Captioning☆21Nov 9, 2017Updated 8 years ago
- Code for paper "Adaptively Aligned Image Captioning via Adaptive Attention Time". NeurIPS 2019☆51Dec 18, 2019Updated 6 years ago
- Accompanying code to reproduce the baselines of the International Multimodal Sentiment Analysis Challenge (MuSe 2020).☆16Dec 8, 2022Updated 3 years ago
- Training a BERT model from scratch.☆11Oct 15, 2023Updated 2 years ago
- ☆11May 18, 2022Updated 3 years ago
- PyTorch implementation of Chinese image captioning on AI_challenger dataset☆13Sep 24, 2017Updated 8 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset☆91Sep 6, 2023Updated 2 years ago
- Extension of hLSTMat☆19Apr 15, 2021Updated 4 years ago
- A VideoQA dataset based on the videos from ActivityNet☆91Nov 22, 2020Updated 5 years ago
- ☆37Jan 5, 2018Updated 8 years ago
- Code for paper "Attention on Attention for Image Captioning". ICCV 2019☆339May 2, 2021Updated 4 years ago
- Chinese named entity recognition using CNN, BiLSTM, and CRF☆13Sep 25, 2020Updated 5 years ago
- ☆15Jul 23, 2019Updated 6 years ago
- Code accompanying the paper "Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs" (Chen et al., …☆200Dec 1, 2022Updated 3 years ago
- Code for ViLBERTScore in EMNLP Eval4NLP☆18Oct 27, 2022Updated 3 years ago
- A PyTorch implementation of Attention Is All You Need (Transformer) for image captioning.☆12Nov 15, 2021Updated 4 years ago
- A lightweight, scalable, and general framework for visual question answering research☆331Sep 3, 2021Updated 4 years ago
- Grid features pre-training code for visual question answering☆269Sep 17, 2021Updated 4 years ago
- Implementation of self-CIDEr and LSA-based diversity metrics (Python 2.7 only).☆36Feb 26, 2022Updated 4 years ago
- Dataset cleansing for Visual Genome☆30Apr 26, 2017Updated 8 years ago
- An image-oriented evaluation tool for image captioning systems (EMNLP-IJCNLP 2019)☆37May 3, 2020Updated 5 years ago