abdelhadie-almalla / image_captioning
☆12Updated last year
Alternatives and similar repositories for image_captioning
Users that are interested in image_captioning are comparing it to the libraries listed below
- Image Captioning using CNN and Transformer.☆55Updated 3 years ago
- Implemented 3 different architectures to tackle the Image Caption problem, i.e., Merged Encoder-Decoder - Bahdanau Attention - Transformer…☆40Updated 4 years ago
- Visual Semantic Relatedness Dataset for Captioning. CVPRW 2023☆10Updated last year
- Pytorch implementation of image captioning using transformer-based model.☆67Updated 2 years ago
- Implemented Image Captioning Model using both Local and Global Attention Techniques and API'fied the model using FLASK☆26Updated 5 years ago
- Pytorch VQA : Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf)☆96Updated 2 years ago
- Multimodal Meme Classification: Identifying Offensive Content in Image and Text☆71Updated 2 years ago
- An implementation that applies pre-trained V+L models to downstream VQA tasks. Now supports: VisualBERT, LXMERT, and UNITER☆164Updated 2 years ago
- Visual Question Answering in PyTorch with various Attention Models☆20Updated 5 years ago
- Using LSTM or Transformer to solve Image Captioning in Pytorch☆79Updated 4 years ago
- ☆66Updated last year
- Pytorch implementation of VQA: Visual Question Answering (https://arxiv.org/pdf/1505.00468.pdf) using VQA v2.0 dataset for open-ended ta…☆20Updated 5 years ago
- ☆62Updated 4 years ago
- BERT + Image Captioning☆133Updated 4 years ago
- CNN+LSTM, Attention based, and MUTAN-based models for Visual Question Answering☆75Updated 5 years ago
- Image captioning models "show and tell" + "show, attend and tell" in PyTorch☆19Updated 7 years ago
- Transformer-based image captioning extension for pytorch/fairseq☆318Updated 4 years ago
- The LSTM model generates captions for the input images after extracting features from pre-trained VGG-16 model. (Computer Vision, NLP, De…☆89Updated 6 years ago
- Image Captioning Using Transformer☆269Updated 3 years ago
- An unofficial implementation of the CVPR 2020 paper Multimodal Categorization of Crisis Events in Social Media☆16Updated 3 years ago
- Simple image-captioning model using Flickr8K dataset☆15Updated 3 years ago
- A PyTorch implementation of the paper Show, Attend and Tell: Neural Image Caption Generation with Visual Attention☆84Updated 5 years ago
- Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning"☆93Updated 8 months ago
- This is a Deep learning project using Flickr8k dataset for CSE 475: Machine Learning☆17Updated 4 years ago
- Python 3 support for the MS COCO caption evaluation tools☆324Updated last year
- Python 3 support for the MS COCO caption evaluation tools☆14Updated last year
- Multi-modal Sarcasm Detection and Humor Classification in Code-mixed Conversations☆12Updated 4 years ago
- Hyperparameter analysis for Image Captioning using LSTMs and Transformers☆26Updated last year
- ☆13Updated 4 years ago
- Vision-Language Pre-training for Image Captioning and Question Answering☆424Updated 3 years ago