SatyamGaba / image_captioning
Image Captioning with CNN, LSTM and RNN using PyTorch on COCO Dataset
☆17 Updated 5 years ago
Alternatives and similar repositories for image_captioning
Users that are interested in image_captioning are comparing it to the libraries listed below
- PyTorch implementation of image captioning using a transformer-based model. ☆66 Updated 2 years ago
- Image Captioning Vision Transformers (ViTs) are transformer models that generate descriptive captions for images by combining the power o… ☆36 Updated 8 months ago
- Image Captioning using CNN and Transformer. ☆53 Updated 3 years ago
- Simple image-captioning model using Flickr8K dataset ☆15 Updated 3 years ago
- Action recognition tutorial using UCF-101 dataset. ☆26 Updated 3 years ago
- Transformer & CNN Image Captioning model in PyTorch. ☆44 Updated 2 years ago
- ☆145 Updated 3 years ago
- Code for the paper: "Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM" ☆60 Updated last year
- Using LSTM or Transformer to solve Image Captioning in PyTorch ☆78 Updated 3 years ago
- PyTorch implementation of Emotic CNN methodology to recognize emotions in images using context information. ☆142 Updated last year
- Deep Neural Networks for Video Classification ☆48 Updated 2 years ago
- In this repository, a simple implementation of Video augmentation is provided to augment videos for machine learning training tasks. ☆21 Updated 6 months ago
- An unofficial implementation of the CVPR 2020 paper Multimodal Categorization of Crisis Events in Social Media ☆16 Updated 3 years ago
- [ICASSP 2023] Official Implementation of ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis ☆26 Updated 2 years ago
- ☆12 Updated last year
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022) ☆193 Updated 2 years ago
- Implemented 3 different architectures to tackle the Image Caption problem, i.e., Merged Encoder-Decoder - Bahdanau Attention - Transformer… ☆40 Updated 4 years ago
- ☆67 Updated 4 years ago
- ☆19 Updated 4 years ago
- action recognition; video classification; LRCN; I3D ☆15 Updated 3 years ago
- Exploring the applicability of Grad-CAM for explanation in video based dataset ☆32 Updated last year
- Code for the paper: Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation. ☆32 Updated last year
- Violence Detection tutorial using pre-trained CNN and LSTM ☆28 Updated 6 years ago
- This is a Deep learning project using Flickr8k dataset for CSE 475: Machine Learning ☆17 Updated 4 years ago
- Implementation of the paper CPTR : FULL TRANSFORMER NETWORK FOR IMAGE CAPTIONING ☆30 Updated 3 years ago
- Image Captioning using CNN+RNN Encoder-Decoder Architecture in PyTorch ☆23 Updated 4 years ago
- My implementation for the paper Context-Aware Emotion Recognition Networks ☆29 Updated 3 years ago
- This is the official repo of paper accepted in AAAI 2023 Oral. ☆89 Updated 2 years ago
- ☆20 Updated 11 months ago
- Graduation project: "Design and Implementation of Video-Text Retrieval Based on the CLIP Model" ☆10 Updated 11 months ago