Multimodal speech emotion recognition (SER) model that recognises emotions from speech using both text and acoustic data. DeBERTaV3 and Wav2Vec2 are fine-tuned to extract features and classify emotions from the text and the audio respectively; their features and classification outputs are then passed through an MLP to achieve better results…
☆11 · Jun 19, 2024 · Updated last year
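The fusion step in the description above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the repository's code: the feature sizes, the number of emotion classes, the random weights, and the helper names (`init_layer`, `fuse`) are all assumptions, and the random vectors merely stand in for the outputs of the fine-tuned DeBERTaV3 and Wav2Vec2 models. It also assumes the "classification" passed to the MLP is each branch's class-probability vector.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Hypothetical helper: small random weights, zero bias."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse(text_feat, text_logits, audio_feat, audio_logits, hidden, out):
    """Concatenate both branches' features and class probabilities,
    then run a one-hidden-layer MLP to get fused emotion probabilities."""
    x = (text_feat + softmax(text_logits)
         + audio_feat + softmax(audio_logits))
    h = relu(linear(x, *hidden))
    return softmax(linear(h, *out))


# Toy sizes (assumptions; real encoder features are much larger).
NUM_EMOTIONS = 4
TEXT_DIM, AUDIO_DIM, HIDDEN_DIM = 8, 8, 16

rng = random.Random(0)
# Stand-ins for the outputs of the two fine-tuned models.
text_feat = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
audio_feat = [rng.gauss(0, 1) for _ in range(AUDIO_DIM)]
text_logits = [rng.gauss(0, 1) for _ in range(NUM_EMOTIONS)]
audio_logits = [rng.gauss(0, 1) for _ in range(NUM_EMOTIONS)]

fused_dim = TEXT_DIM + AUDIO_DIM + 2 * NUM_EMOTIONS
hidden = init_layer(rng, fused_dim, HIDDEN_DIM)
out = init_layer(rng, HIDDEN_DIM, NUM_EMOTIONS)

probs = fuse(text_feat, text_logits, audio_feat, audio_logits, hidden, out)
predicted = max(range(NUM_EMOTIONS), key=probs.__getitem__)
print(predicted, probs)
```

In a real setup the MLP weights would be trained on labelled data while the two encoders supply the features; here the weights are random, so the printed prediction is meaningless beyond showing the data flow.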
Alternatives and similar repositories for multimodal-speech-emotion-recognition
Users who are interested in multimodal-speech-emotion-recognition are comparing it to the libraries listed below
- A Fully End2End Multimodal System for Fast Yet Effective Video Emotion Recognition ☆39 · Aug 12, 2024 · Updated last year
- "MULTIMODAL EMOTION RECOGNITION BASED ON DEEP TEMPORAL FEATURES USING CROSS-MODAL TRANSFORMER AND SELF-ATTENTION" ICASSP'23 ☆23 · Feb 26, 2023 · Updated 3 years ago
- FRAME-LEVEL EMOTIONAL STATE ALIGNMENT METHOD FOR SPEECH EMOTION RECOGNITION ☆23 · Dec 22, 2024 · Updated last year
- This package is essentially a ros-wrapper of neural_cam. More features would be added in the future, geared towards mobile robot platform… ☆11 · Jul 12, 2019 · Updated 6 years ago
- MultiEMO: An Attention-Based Correlation-Aware Multimodal Fusion Framework for Emotion Recognition in Conversations (ACL 2023) ☆93 · Nov 17, 2023 · Updated 2 years ago
- ☆11 · Jul 7, 2020 · Updated 5 years ago
- The implementation codes of paper: Multimodal Sentiment Analysis with Mutual Information-based Disentangled Representation Learning ☆18 · May 8, 2025 · Updated 10 months ago
- A grunt.js task to render Handlebars templates against a context & produce HTML ☆14 · Mar 10, 2018 · Updated 7 years ago
- A collection of OCR'd and machine-corrected Greek texts. This base repository contains Git submodules for the different works and an inve… ☆11 · Nov 18, 2014 · Updated 11 years ago
- Speech understanding system training toolkit, including tasks of ASR, SSL, LM, etc. ☆11 · Feb 12, 2026 · Updated 3 weeks ago
- The code and data for "Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization" ☆11 · May 16, 2023 · Updated 2 years ago
- IEEE T-BIOM: "Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention" ☆45 · Nov 29, 2024 · Updated last year
- Python and Scala APIs for enhanced Spark analytics ☆12 · Mar 15, 2017 · Updated 8 years ago
- This is an official implementation in PyTorch of PTH-Net: Dynamic Facial Expression Recognition without Face Detection and Alignment. ☆13 · Jul 1, 2025 · Updated 8 months ago
- Two-stage routing with Optimized Guided search and Greedy algorithm ☆10 · Sep 27, 2023 · Updated 2 years ago
- ☆11 · Oct 24, 2022 · Updated 3 years ago
- A monolithic index that supports worst-case optimal joins (WCOJ) by providing all collation orders in a single redundancy eliminating dat… ☆16 · Sep 18, 2025 · Updated 5 months ago
- ☆10 · May 24, 2021 · Updated 4 years ago
- ☆10 · Jul 16, 2024 · Updated last year
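- This repository describes the implementation of the 3rd-place entry in the CCAC2023 multimodal conversational emotion recognition evaluation ☆11 · Aug 11, 2024 · Updated last year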
- We present a study of a neural network based method for speech emotion recognition, using audio-only features. In the studied scheme, the… ☆11 · Jul 24, 2024 · Updated last year
- Regularized latent variable mixed membership modeling ☆13 · Aug 12, 2013 · Updated 12 years ago
- ☆18 · Oct 13, 2025 · Updated 4 months ago
- ☆13 · Oct 17, 2020 · Updated 5 years ago
- Awesome Multimodal Fusion in Speech Emotion Recognition ☆13 · Nov 11, 2025 · Updated 3 months ago
- Code for paper "Cross-Domain Slot Filling as Machine Reading Comprehension" in IJCAI 2021 ☆11 · Aug 24, 2021 · Updated 4 years ago
- ☆10 · Oct 16, 2025 · Updated 4 months ago
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th… ☆11 · Feb 23, 2024 · Updated 2 years ago
- Risk Minimization Algorithms in Structured Prediction (JMLR 2016) ☆13 · Jan 26, 2017 · Updated 9 years ago
- ☆11 · Nov 11, 2022 · Updated 3 years ago
- Communicate with the EVRYTHNG Cloud over MQTT for FreeRTOS devices ☆10 · May 4, 2017 · Updated 8 years ago
- Implementation of Monte Carlo Word Movers Distance in Python with TensorFlow ☆12 · Sep 12, 2016 · Updated 9 years ago
- ☆10 · Jan 18, 2024 · Updated 2 years ago
- MMER ☆14 · Jan 8, 2026 · Updated 2 months ago
- JavaScript deployment for Howl, the wake word detection modeling toolkit for Firefox Voice ☆10 · Aug 15, 2020 · Updated 5 years ago
- Groovy Neural Network library ☆10 · Apr 22, 2017 · Updated 8 years ago
- Substitute alternative spellings of special characters (e.g. German umlauts [ae, oe, ue] and [ss]) with their correct versions (ä, ö, ü, … ☆11 · Nov 24, 2024 · Updated last year
- A lecture summarization tool that uses AI and computer vision to summarize and index videos ☆11 · Dec 8, 2022 · Updated 3 years ago
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Mar 14, 2025 · Updated 11 months ago