abikaki / awesome-speech-emotion-recognition
Awesome lists about Speech Emotion Recognition
☆93 Updated 6 months ago
Alternatives and similar repositories for awesome-speech-emotion-recognition
Users that are interested in awesome-speech-emotion-recognition are comparing it to the libraries listed below
- ☆66 Updated 10 months ago
- Whisper-Flamingo [Interspeech 2024] and mWhisper-Flamingo [IEEE SPL 2025] for Audio-Visual Speech Recognition and Translation ☆173 Updated 2 months ago
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark ☆257 Updated 3 months ago
- [Interspeech 2023] Intelligible Lip-to-Speech Synthesis with Speech Units ☆40 Updated 8 months ago
- EMO-SUPERB submission ☆44 Updated 10 months ago
- A Compact and Effective Pretrained Model for Speech Emotion Recognition ☆41 Updated last year
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey". ☆148 Updated 3 weeks ago
- [WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition ☆97 Updated 2 years ago
- This is the audio sample repository for speech separation model "MossFormer2". ☆135 Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆88 Updated last year
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆206 Updated last year
- [INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for … ☆158 Updated last month
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆208 Updated last year
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆234 Updated last month
- VoiceLDM: Text-to-Speech with Environmental Context ☆181 Updated 11 months ago
- [Interspeech 2024] SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization ☆55 Updated 3 months ago
- Training code for FAcodec presented in NaturalSpeech3 ☆212 Updated 10 months ago
- An implementation of Speech Emotion Recognition, based on HuBERT model, training with PyTorch and HuggingFace framework, and fine-tuning … ☆34 Updated 3 years ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆166 Updated last year
- [INTERSPEECH 2022] This dataset is designed for multi-modal speaker diarization and lip-speech synchronization in the wild. ☆52 Updated last year
- ☆120 Updated 2 years ago
- A collection of datasets for the purpose of emotion recognition/detection in speech. ☆354 Updated 9 months ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆239 Updated this week
- A curated list of awesome voice conversion, projects and communities. ☆239 Updated 6 months ago
- Update ASR paper everyday ☆259 Updated this week
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen… ☆235 Updated 3 years ago
- List of direct speech-to-speech translation papers. ☆37 Updated 2 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆64 Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆53 Updated last month
- PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T… ☆194 Updated 2 years ago