IsaacRodgz / multimodal-transformers-movies
Experiments with multimodal deep learning models based on transformers
☆12Updated 2 years ago
Alternatives and similar repositories for multimodal-transformers-movies
Users who are interested in multimodal-transformers-movies are comparing it to the repositories listed below:
- Code for selecting an action based on multimodal inputs; here the inputs are voice and text.☆73Updated 4 years ago
- ☆16Updated 4 years ago
- EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"☆12Updated last year
- ☆31Updated 4 years ago
- Multimodal short video classification, integrating video, image, audio, and text modalities☆19Updated 5 years ago
- Multi-modal transformer approach for natural language query based joint video summarization and highlight detection☆16Updated last year
- Using VideoBERT to tackle video prediction☆132Updated 4 years ago
- Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval." CVPR 2022☆112Updated 3 years ago
- ☆56Updated 3 years ago
- PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)☆145Updated 2 years ago
- PyTorch implementation of the paper "Distilling Audio-Visual Knowledge by Compositional Contrastive Learning" (CVPR 2021)☆89Updated 4 years ago
- The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)☆79Updated 2 years ago
- PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)☆126Updated 2 years ago
- Video Summarization With Spatiotemporal Vision Transformer☆21Updated 2 years ago
- [TMLR 2022] High-Modality Multimodal Transformer☆117Updated 10 months ago
- [ACM MM 2021 Oral] Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation☆40Updated 4 years ago
- In-the-wild Question Answering☆15Updated 2 years ago
- ☆18Updated 2 years ago
- Condensed Movies Challenge 2021☆19Updated 3 years ago
- Generalized cross-modal NNs; new audiovisual benchmark (IEEE TNNLS 2019)☆28Updated 5 years ago
- Easiest way of fine-tuning HuggingFace video classification models☆145Updated 2 years ago
- [TMM 2023] VideoXum: Cross-modal Visual and Textural Summarization of Videos☆49Updated last year
- Graph learning framework for long-term video understanding☆66Updated 2 months ago
- Code for the Video Similarity Challenge.☆80Updated last year
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390)☆28Updated 8 months ago
- Source code for the AAAI 2021 paper "Movie Summarization via Sparse Graph Construction"☆32Updated 4 years ago
- Multimodal classification solution for the SIGIR eCOM using Co-attention and transformer language models☆19Updated 5 years ago
- [CVPR 2024] MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos☆34Updated 7 months ago
- CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis (MM 2020)☆114Updated 4 years ago
- A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization" (IEEE IS…☆90Updated 2 years ago