Shreyz-max / Video-Captioning

Video Captioning is an encoder-decoder model based on sequence-to-sequence learning

☆137

Alternatives and similar repositories for Video-Captioning

Users that are interested in Video-Captioning are comparing it to the libraries listed below

v-iashin / BMT
Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
☆227 Updated 2 years ago
simon-ging / coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
☆289 Updated 2 years ago
scopeInfinity / Video2Description
Video to Text: Natural language description generator for some given video. [Video Captioning]
☆350 Updated 3 years ago
hobincar / pytorch-video-feature-extractor
A repository for extracting CNN features from videos using PyTorch
☆70 Updated 2 years ago
v-iashin / MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
☆143 Updated 2 years ago
facebookresearch / grounded-video-description
Video Grounding and Captioning
☆326 Updated 3 years ago
antoine77340 / video_feature_extractor
Easy-to-use deep video feature extractor
☆319 Updated 5 years ago
saahiluppal / catr
Image Captioning Using Transformer
☆268 Updated 3 years ago
ok1zjf / VASNet
PyTorch implementation of the ACCV 2018-AIU2018 paper Video Summarization with Attention
☆183 Updated 3 years ago
e-apostolidis / PGL-SUM
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization" (IEEE IS…
☆89 Updated 2 years ago
albanie / collaborative-experts
Video embeddings for retrieval with natural language queries
☆342 Updated 2 years ago
li-plus / DSNet
DSNet: A Flexible Detect-to-Summarize Network for Video Summarization
☆218 Updated 3 years ago
shruti-jadon / Video-Summarization-using-Keyframe-Extraction-and-Video-Skimming
Experimenting with different summarization techniques on the SumMe dataset
☆139 Updated 5 years ago
ammesatyajit / VideoBERT
Using VideoBERT to tackle video prediction
☆130 Updated 4 years ago
nasib-ullah / video-captioning-models-in-Pytorch
A PyTorch implementation of state-of-the-art video captioning models from 2015-2019 on the MSVD and MSRVTT datasets.
☆73 Updated last year
vijayvee / video-captioning
This repository contains the code for a video captioning system inspired by Sequence to Sequence -- Video to Text. This system takes as i…
☆166 Updated 5 years ago
tanishqgautam / Image-Captioning
Implemented 3 different architectures to tackle the image captioning problem, i.e., Merged Encoder-Decoder - Bahdanau Attention - Transformer…
☆40 Updated 4 years ago
TIBHannover / MSVA
Deep learning model for supervised video summarization called Multi Source Visual Attention (MSVA)
☆45 Updated last year
v-iashin / video_features
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and T…
☆605 Updated 5 months ago
dabasajay / Image-Caption-Generator
A neural network to generate captions for an image using CNN and RNN with BEAM Search.
☆305 Updated 4 years ago
jayleicn / moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
☆321 Updated last year
microsoft / SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
☆240 Updated 3 years ago
TIBHannover / UnsupervisedVideoSummarization
Source code for the paper "Unsupervised Video Summarization via Multi-source Features" published at ICMR 2021
☆21 Updated 3 years ago
nchucvml / STVT
Video Summarization With Spatiotemporal Vision Transformer
☆21 Updated 2 years ago
ttengwang / PDVC
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
☆219 Updated last year
ihababdelkareem / video-summarization
Key-frame based summarization of videos
☆27 Updated 2 years ago
ttengwang / dense-video-captioning-pytorch
Second-place solution to dense video captioning task in ActivityNet Challenge (CVPR 2020 workshop)
☆75 Updated 3 years ago
microsoft / UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
☆358 Updated 11 months ago
jayleicn / TVRetrieval
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
☆160 Updated last year
Axe-- / ActionBERT
Transformer for Action Recognition in PyTorch
☆38 Updated 5 years ago