aita-lab / awesome-multimodal-fusion-emotion-recognition
Awesome Multimodal Fusion in Speech Emotion Recognition
☆13 · Updated 2 months ago
Alternatives and similar repositories for awesome-multimodal-fusion-emotion-recognition
Users who are interested in awesome-multimodal-fusion-emotion-recognition are comparing it to the repositories listed below.
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆16 · Updated last year
- ☆17 · Updated last year
- Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition" ☆19 · Updated 2 years ago
- Repository for reproducing the results in the journal paper "Self-supervised learning for Speech Emotion Recognition" ☆10 · Updated 2 years ago
- BLSP-Emo: Towards Empathetic Large Speech-Language Models ☆56 · Updated last year
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆40 · Updated last year
- [ACL 2024] PyTorch code for the paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing" ☆97 · Updated last year
- Official PyTorch implementation for "Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech … ☆32 · Updated 8 months ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an… ☆35 · Updated 2 years ago
- [ACM MM 2023] Official PyTorch implementation of "Emo-DNA: Emotion Decoupling and Alignment Learning for Cross-Corpus Speech Emotion Reco… ☆12 · Updated 2 years ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages ☆18 · Updated last year
- Code for paper "MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recogni… ☆16 · Updated 2 years ago
- MSP-Podcast Challenge Baseline Code for Interspeech 2025 ☆28 · Updated last year
- EMO-SUPERB submission ☆50 · Updated 3 months ago
- ☆22 · Updated last year
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆53 · Updated last year
- Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers" ☆14 · Updated 11 months ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆20 · Updated 10 months ago
- Repository with the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini… ☆13 · Updated last year
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe… ☆60 · Updated last year
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition ☆26 · Updated last year
- ☆23 · Updated last month
- ☆14 · Updated last year
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆43 · Updated 2 years ago
- MSP-Podcast Challenge Baseline Code ☆30 · Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Updated 10 months ago
- Hugging Face implementation of AV-HuBERT on the MuAViC Dataset ☆17 · Updated 10 months ago
- Towards a general language-audio model for computational paralinguistic tasks ☆23 · Updated last year
- ☆24 · Updated last year
- Official PyTorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat… ☆56 · Updated 2 weeks ago