bagustris / ssl-ser
Repository for reproducing the results of the journal paper "Self-supervised learning for Speech Emotion Recognition"
☆9 · Updated last year
Related projects
Alternatives and complementary repositories for ssl-ser
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- ☆13 · Updated last month
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆13 · Updated last month
- Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition" ☆17 · Updated 3 years ago
- Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit ☆13 · Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Updated last year
- Prosodic Speech Segmentation with Transformers ☆23 · Updated 8 months ago
- MSP-Podcast Challenge Baseline Code ☆16 · Updated 4 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆12 · Updated 2 months ago
- Code for the paper "FLowHigh: Towards efficient and high-quality audio super-resolution with single-step flow matching" ☆16 · Updated this week
- Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-… ☆19 · Updated last month
- ☆13 · Updated last year
- End-to-End Speech Synthesis system with FastSpeech2 & HiFi-GAN ☆13 · Updated 2 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering ☆18 · Updated last year
- This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M… ☆17 · Updated last year
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆10 · Updated 11 months ago
- Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech ☆22 · Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi ☆13 · Updated 2 years ago
- ☆41 · Updated last year
- ☆10 · Updated last year
- ☆11 · Updated 3 years ago
- ☆15 · Updated 3 years ago
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024 ☆13 · Updated last week
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms ☆18 · Updated last year
- ☆18 · Updated 2 months ago
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P… ☆34 · Updated 11 months ago
- ☆16 · Updated 2 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Updated 2 years ago
- ☆11 · Updated last year
- End-to-end diarization loss ☆22 · Updated 3 years ago