IsaacRodgz / multimodal-transformers-moviesLinks
Experiments with multimodal deep learning models based on transformers
☆12Updated 2 years ago
Alternatives and similar repositories for multimodal-transformers-movies
Users that are interested in multimodal-transformers-movies are comparing it to the libraries listed below
Sorting:
- EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"☆10Updated last year
- Code on selecting an action based on multimodal inputs. Here in this case inputs are voice and text.☆73Updated 4 years ago
- ☆16Updated 4 years ago
- ☆55Updated 2 years ago
- Multimodal short video classification task, integrating video, image, audio and text modes for short video classification☆19Updated 5 years ago
- PyTorch Implementation on Paper [CVPR2021]Distilling Audio-Visual Knowledge by Compositional Contrastive Learning☆87Updated 3 years ago
- EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important tempo…☆21Updated last year
- This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information M…☆186Updated 2 years ago
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES"☆26Updated 4 months ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)☆27Updated 5 years ago
- Source code of our MM'22 paper Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning☆13Updated 2 years ago
- Official code repo for TCLR: Temporal Contrastive Learning for Video Representation [CVIU-2022]☆37Updated last year
- Using VideoBERT to tackle video prediction☆128Updated 4 years ago
- ☆18Updated last year
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models☆19Updated 4 years ago
- Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.☆51Updated 3 years ago
- The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)☆73Updated 2 years ago
- PyTorch implementation of the models described in the IEEE ICASSP 2022 paper "Is cross-attention preferable to self-attention for multi-m…☆59Updated 2 months ago
- ☆31Updated 3 years ago
- [TMM 2023] VideoXum: Cross-modal Visual and Textural Summarization of Videos☆45Updated last year
- Code for the paper: Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation.☆32Updated last year
- Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing☆17Updated 2 years ago
- Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.☆54Updated last year
- CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis(MM2020)☆112Updated 4 years ago
- Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval." CVPR 2022☆106Updated 2 years ago
- Pytorch implementation for Tailor Versatile Multi-modal Learning for Multi-label Emotion Recognition☆60Updated 2 years ago
- Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition☆30Updated 4 years ago
- FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition☆29Updated 6 months ago
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos☆32Updated 4 months ago
- Code for the Video Similarity Challenge.☆79Updated last year