akashe / Multimodal-action-recognition

Code for selecting an action based on multimodal inputs. In this case, the inputs are voice and text.

☆73

Alternatives and similar repositories for Multimodal-action-recognition

Users who are interested in Multimodal-action-recognition are comparing it to the libraries listed below.

yanbeic / CCL
PyTorch implementation of the CVPR 2021 paper "Distilling Audio-Visual Knowledge by Compositional Contrastive Learning"
☆89 Updated 4 years ago
WasifurRahman / BERT_multimodal_transformer
☆206 Updated 3 years ago
zinengtang / TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
☆125 Updated 2 years ago
maysonma / VAANet
[AAAI 2020] Official implementation of VAANet for Emotion Recognition
☆78 Updated last year
jbdel / modulated_fusion_transformer
Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition
☆30 Updated 4 years ago
praveena2j / Cross-Attentional-AV-Fusion
FG2021: Cross Attentional AV Fusion for Dimensional Emotion Recognition
☆32 Updated 8 months ago
jedyang97 / MTAG
Code for NAACL 2021 paper: MTAG: Modal-Temporal Attention Graph for Unaligned Human Multimodal Language Sequences
☆43 Updated 2 years ago
anita-hu / MSAF
Official implementation of the paper "MSAF: Multimodal Split Attention Fusion"
☆82 Updated 4 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆90 Updated 2 years ago
ninatu / everything_at_once
Official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval." CVPR 2022
☆109 Updated 3 years ago
declare-lab / BBFN
This repository contains the implementation of the paper -- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment An…
☆72 Updated 2 years ago
roudimit / AVLnet
Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.
☆52 Updated 3 years ago
MANLP-suda / HHMPN
Multi-modal Multi-label Emotion Recognition with Heterogeneous Hierarchical Message Passing
☆17 Updated 2 years ago
kniter1 / TAILOR
PyTorch implementation of "Tailor Versatile Multi-modal Learning for Multi-label Emotion Recognition"
☆60 Updated 2 years ago
affect2mm / emotion-timeseries
☆16 Updated 4 years ago
ammesatyajit / VideoBERT
Using VideoBERT to tackle video prediction
☆130 Updated 4 years ago
IBM / AdaMML
Official implementation of AdaMML. https://arxiv.org/abs/2105.05165.
☆51 Updated 3 years ago
v-iashin / MDVC
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
☆144 Updated 2 years ago
ExplainableML / AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆37 Updated 2 years ago
sunlightsgy / MEmoR
Code and dataset of "MEmoR: A Dataset for Multimodal Emotion Reasoning in Videos" in MM'20.
☆54 Updated 2 years ago
declare-lab / Multimodal-Infomax
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information M…
☆190 Updated 2 years ago
drv-agwl / ViViT-pytorch
☆68 Updated 4 years ago
yunyikristy / CM-ACC
Cross-modal active contrastive coding
☆22 Updated 4 years ago
PALMJJ / Multimodal-short-video-classification
Multimodal short video classification that integrates the video, image, audio, and text modalities
☆19 Updated 5 years ago
JingyuanYY / EmoSet
This is the official implementation of 2023 ICCV paper "EmoSet: A large-scale visual emotion dataset with rich attributes".
☆50 Updated last year
codezakh / exploiting-BERT-thru-translation
[ACM MM 2021 Oral] "Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation"
☆40 Updated 4 years ago
junchen14 / Multi-Modal-Transformer
This repository collects various multi-modal transformer architectures, including image transformer, video transformer, image-languag…
☆229 Updated 2 years ago
jasongief / PSP_CVPR_2021
[2021 CVPR] Positive Sample Propagation along the Audio-Visual Event Line
☆42 Updated 3 years ago
praveena2j / JointCrossAttentional-AV-Fusion
ABAW3 (CVPRW): A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition
☆46 Updated last year
EvelynFan / AWESOME-MER
🔆 📝 A reading list focused on Multimodal Emotion Recognition (MER) 👂👄 👀 💬
☆120 Updated 4 years ago