valiakon / MultimodalAnalysis_SpeakerDiarization
This project addresses speaker diarization using audio features, face recognition, and video features extracted from face images and mouth tracking.
☆15Updated 6 years ago
Alternatives and similar repositories for MultimodalAnalysis_SpeakerDiarization:
Users who are interested in MultimodalAnalysis_SpeakerDiarization are comparing it to the repositories listed below
- [ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations☆36Updated last year
- ☆104Updated 2 years ago
- Repository for code and paper submitted for APSIPA 2019, Lanzhou, China☆22Updated 6 months ago
- Human emotions are one of the strongest ways of communication. Even if a person doesn’t understand a language, he or she can very well u…☆24Updated 3 years ago
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing'☆38Updated 2 years ago
- Noise15, Noisex-92 and Nonspeech☆38Updated 4 years ago
- Audio-Visual Voice Activity Detection using Deep Learning☆48Updated 5 years ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)☆18Updated 6 months ago
- This is the official code for paper "Speech Emotion Recognition with Global-Aware Fusion on Multi-scale Feature Representation" published…☆46Updated 2 years ago
- TensorFlow implementation of "Attentive Modality Hopping for Speech Emotion Recognition," ICASSP-20☆32Updated 4 years ago
- This repository contains the code for our ICASSP paper `Speech Emotion Recognition using Semantic Information` https://arxiv.org/pdf/2103…☆24Updated 3 years ago
- ☆41Updated 4 years ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition☆32Updated 7 months ago
- ☆10Updated 3 years ago
- Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transfor…☆20Updated last year
- [ICASSP 2020] Speech Emotion Recognition with Dual-Sequence LSTM Architecture☆12Updated 3 weeks ago
- ☆17Updated 3 years ago
- Attention Backend for Automatic Speaker Verification with Multiple Enrollment Utterances☆49Updated 2 years ago
- PyTorch implementation of RawNeXt: Speaker verification system for variable-duration utterance with deep layer aggregation and dynamic sc…☆23Updated 2 years ago
- Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in PyTorch☆43Updated 4 years ago
- Automatic speech emotion recognition based on transfer learning from spectrograms using ResNet☆21Updated 2 years ago
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆17Updated 2 years ago
- DWFormer: Dynamic Window Transformer for Speech Emotion Recognition (ICASSP 2023 Oral)☆57Updated 7 months ago
- Implementation of Hybrid CTC/Attention Architecture for End-to-End Speech Recognition in pure Python and PyTorch☆25Updated 6 months ago
- Code for paper "Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition"☆39Updated last year
- This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyw…☆28Updated last month
- Speech enhancement☆15Updated 3 years ago
- [ICASSP 2024] Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition☆21Updated 10 months ago
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM☆39Updated 2 years ago
- ☆15Updated 2 months ago