A collection of datasets for the purpose of emotion recognition/detection in speech.
☆400Sep 30, 2024Updated last year
Alternatives and similar repositories for SER-datasets
Users that are interested in SER-datasets are comparing it to the libraries listed below
Sorting:
- This is the GitHub page for publicly available emotional speech data.☆381Jan 6, 2022Updated 4 years ago
- Lightweight and Interpretable ML Model for Speech Emotion Recognition and Ambiguity Resolution (trained on IEMOCAP dataset)☆442Dec 21, 2023Updated 2 years ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆77Jul 16, 2023Updated 2 years ago
- The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems☆283Oct 10, 2023Updated 2 years ago
- [INTERSPEECH 2024] EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark☆309Mar 31, 2025Updated 11 months ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'☆140Jan 6, 2025Updated last year
- This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data…☆137Oct 24, 2021Updated 4 years ago
- [ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…☆1,063Dec 23, 2024Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆214Sep 10, 2024Updated last year
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995☆78Dec 3, 2024Updated last year
- Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023☆252Jun 5, 2025Updated 9 months ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration…☆328Sep 24, 2022Updated 3 years ago
- PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.☆330Feb 9, 2024Updated 2 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆96Nov 20, 2024Updated last year
- The official repository for the paper “NonVerbalSpeech-38K: A Scalable Pipeline for Enabling Non-Verbal Speech Generation and Understandi…☆63Dec 26, 2025Updated 2 months ago
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe…☆60Jul 1, 2024Updated last year
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing☆71Dec 2, 2022Updated 3 years ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆183Updated this week
- ICASSP 2023 Accepted☆190May 6, 2024Updated last year
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations☆184Mar 6, 2024Updated last year
- [ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"☆368Sep 3, 2024Updated last year
- How to use our public wav2vec2 dimensional emotion model☆540May 22, 2023Updated 2 years ago
- Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition☆153Oct 26, 2021Updated 4 years ago
- Dataset and baseline code for the VocalSound dataset (ICASSP2022).☆161Nov 12, 2022Updated 3 years ago
- ☆29Mar 8, 2022Updated 3 years ago
- [InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆59Oct 23, 2024Updated last year
- A toolkit for any-to-any encoder-decoder voice conversion systems☆84Aug 10, 2023Updated 2 years ago
- Keep track of big models in audio domain, including speech, singing, music etc.☆506Sep 26, 2024Updated last year
- ☆121Oct 24, 2022Updated 3 years ago
- Text-To-Speech for NotebookLM☆39Jul 20, 2025Updated 7 months ago
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning☆159Jun 13, 2024Updated last year
- ☆69Jul 29, 2023Updated 2 years ago
- UniSpeech - Large Scale Self-Supervised Learning for Speech☆479Apr 5, 2024Updated last year
- ☆259May 15, 2023Updated 2 years ago
- A library for speech data augmentation in time-domain☆683Aug 30, 2021Updated 4 years ago
- 🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).☆2,133Jun 6, 2024Updated last year
- Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'☆155Mar 24, 2025Updated 11 months ago
- Speech Emotion Recognition using transfer learning with wav2vec on IEMOCAP.☆17Aug 8, 2021Updated 4 years ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit☆2,533Jun 13, 2025Updated 8 months ago