WujiangXu / MHSCNet

Code for the ICASSP 2023 paper "MHSCNet: A Multimodal Hierarchical Shot-aware Convolutional Network for Video Summarization"

☆10

Alternatives and similar repositories for MHSCNet

Users who are interested in MHSCNet are comparing it to the repositories listed below.

nchucvml / STVT
Video Summarization With Spatiotemporal Vision Transformer
☆21 Updated 2 years ago
e-apostolidis / CA-SUM
A PyTorch Implementation of CA-SUM from "Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of …
☆30 Updated 3 years ago
boheumd / A2Summ
The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)
☆76 Updated 2 years ago
ufal / MLASK
EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"
☆12 Updated last year
HopLee6 / SSPVS-PyTorch
PyTorch implementation for "Progressive Video Summarization via Multimodal Self-supervised Learning"
☆34 Updated 2 years ago
e-apostolidis / PGL-SUM
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization" (IEEE IS…
☆89 Updated 2 years ago
phaphuang / DSR-RL
PyTorch implementation of DSR-RL for the video summarization task
☆12 Updated 3 years ago
jylins / videoxum
[TMM 2023] VideoXum: Cross-modal Visual and Textural Summarization of Videos
☆45 Updated last year
Roc-Ng / HANet
PyTorch implementation of HANet: Hierarchical Alignment Networks for Video-Text Retrieval (ACM MM 2021).
☆47 Updated 3 years ago
AndresPMD / semantic_adaptive_margin
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching
☆16 Updated 3 years ago
lzp870 / RSFD
☆8 Updated last year
ruc-aimc-lab / LAFF
[ECCV 2022] LAFF for Text-to-Video Retrieval
☆45 Updated last year
TIBHannover / UnsupervisedVideoSummarization
Source code for the paper "Unsupervised Video Summarization via Multi-source Features" published at ICMR 2021
☆21 Updated 3 years ago
shuoyang129 / eamat
Entity-Aware and Motion-Aware Transformers for Language-driven Action Localization (IJCAI-22)
☆12 Updated 2 years ago
yanbeic / CCL
PyTorch implementation of the paper [CVPR 2021] Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
☆89 Updated 4 years ago
YuanEZhou / CBTrans
☆22 Updated 3 years ago
yangbang18 / CARE
(TIP'2023) Concept-Aware Video Captioning: Describing Videos with Effective Prior Information
☆29 Updated 6 months ago
Sejong-VLI / V2T-Action-Graph-JKSUCIS-2023
Implementation of the paper "Action Knowledge for Video Captioning with Graph Neural Networks" (JKSUCIS 2023).
☆12 Updated 2 years ago
JustinYuu / MACIL_SD
[ACM MM 2022] Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection
☆37 Updated 3 years ago
guoshengcv / CACL
[CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning
☆24 Updated 3 years ago
praveena2j / Cross-Attentional-AV-Fusion
FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition
☆30 Updated 7 months ago
tanghaoyu258 / ACRM-for-moment-retrieval
☆27 Updated 2 years ago
ylqi / GL-RG
Code for the IJCAI 2022 paper "GL-RG: Global-Local Representation Granularity for Video Captioning".
☆19 Updated 2 years ago
huangmozhi9527 / ConMH
[AAAI 2023] Contrastive Masked Autoencoders for Self-Supervised Video Hashing
☆27 Updated 2 years ago
HuiGuanLab / ms-sl
Source code of our MM'22 paper "Partially Relevant Video Retrieval"
☆53 Updated 8 months ago
artelab / Image-and-Text-fusion-for-UPMC-Food-101-using-BERT-and-CNNs
☆61 Updated 4 years ago
praveena2j / RecurrentJointAttentionwithLSTMs
ICASSP 2023: "Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition"
☆12 Updated 7 months ago
v-iashin / MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
☆143 Updated 2 years ago
ninatu / everything_at_once
Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval." CVPR 2022
☆108 Updated 3 years ago
medhini / Instructional-Video-Summarization
Code for the paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" (ECCV 2022)
☆38 Updated 2 years ago