SatyamGaba / image_captioning
Image Captioning with CNN, LSTM and RNN using PyTorch on COCO Dataset
☆18 Updated 5 years ago
Alternatives and similar repositories for image_captioning
Users that are interested in image_captioning are comparing it to the libraries listed below
- ☆69 Updated 4 years ago
- CNN LSTM architecture implemented in PyTorch for Video Classification ☆295 Updated 2 years ago
- Video Swin Transformer - PyTorch ☆264 Updated 3 years ago
- PyTorch Implementation of AlexNet ☆213 Updated 2 years ago
- Transformer & CNN Image Captioning model in PyTorch. ☆44 Updated 2 years ago
- PyTorch ViT for Image classification on the CIFAR10 dataset ☆45 Updated 3 years ago
- [ICASSP 2023] Official Implementation of ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis ☆28 Updated 2 years ago
- PyTorch implementation of a collection of scalable Video Transformer Benchmarks. ☆303 Updated 3 years ago
- Action recognition tutorial using the UCF-101 dataset. ☆28 Updated 3 years ago
- Basic implementation of ResNet 50, 101, 152 in PyTorch ☆116 Updated 3 years ago
- Efficient dual attention SlowFast networks for video action recognition ☆24 Updated 3 years ago
- Official implementation of the AAAI 2024 paper: Open-Set Facial Expression Recognition ☆32 Updated last year
- Video classification on UCF101 using a CNN and RNN, based on the PyTorch framework. ☆63 Updated last year
- ☆72 Updated 3 years ago
- Using ResNet3D to train on Kinetics from scratch or fine-tune on UCF-101 (or others) with a Kinetics-pretrained model. ☆29 Updated 5 years ago
- Code for the paper: "Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM" ☆61 Updated last year
- Squeeze and Excitation network implementation. ☆18 Updated 6 years ago
- Image Captioning Vision Transformers (ViTs) are transformer models that generate descriptive captions for images by combining the power o… ☆35 Updated last year
- Based on our paper "Implementing vision transformer for classifying 2D biomedical images" published in Scientific Reports (Nature) ☆13 Updated last year
- fourierer / Video_Classification_ResNet3D_R2plus1D_ip-CSN_train-UCF101-HMDB51-Kinetics400-from-scratch: Using ResNet3D-50, R(2+1)D-50, and ip_CSN-50 to train on UCF-101, HMDB-51 and Kinetics-400 from scratch. ☆28 Updated 5 years ago
- Exploring the applicability of Grad-CAM for explanation in video-based datasets ☆32 Updated 2 years ago
- [arXiv 2024] PyTorch implementation of RRD: https://arxiv.org/abs/2407.12073 ☆13 Updated 7 months ago
- PyTorch implementation of "Real-time Convolutional Neural Networks for Emotion and Gender Classification" (mini-Xception) ☆66 Updated 4 years ago
- ☆46 Updated 4 years ago
- Implementation of Vision Transformer from scratch and performance compared to standard CNNs (ResNets) and pre-trained ViT on CIFAR10 and … ☆114 Updated last year
- A simple implementation of video augmentation for augmenting videos in machine learning training tasks. ☆20 Updated 10 months ago
- [ACM MM '24 Poster] Official repository of paper titled "Towards Robustness Prompt Tuning with Fully Test-Time Adaptation for CLIP’s Zero… ☆10 Updated last year
- Implementation of ViViT: A Video Vision Transformer ☆551 Updated 4 years ago
- EfficientNetV2 PyTorch (PyTorch Lightning) implementation with pretrained model ☆81 Updated 2 years ago
- Action recognition; video classification; LRCN; I3D ☆15 Updated 4 years ago