Choddeok / DiEmoTTS

[INTERSPEECH 2025] The official implementation of DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech

☆13

Alternatives and similar repositories for DiEmoTTS

Users who are interested in DiEmoTTS are comparing it to the repositories listed below

shang0712 / HierTTS
☆45Updated 2 years ago
thuhcsi / DiffVar
☆30Updated last year
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32Updated last year
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18Updated 2 years ago
Audio-Foundation-Models / ConversationTTS
☆74Updated 2 weeks ago
jiaqili3 / DualCodec
A Low-Frame-Rate, Semantically-Enhanced Neural Audio Codec for Speech Generation
☆34Updated 2 weeks ago
Dapwner / CVAE-Tacotron
☆23Updated last year
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial PyTorch reproduction of the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆61Updated last year
light1726 / SpeechTripleNet
The implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆33Updated last year
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37Updated 2 years ago
lucadellalib / discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
☆24Updated 9 months ago
AI-Unicamp / TTS-Objective-Metrics
Objective metrics used in several text-to-speech (TTS) papers.
☆49Updated this week
WangHelin1997 / DuTa-VC
Source code and demo for the INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…
☆37Updated last year
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆36Updated 4 months ago
xinshengwang / robpitch
A pitch detection model trained to be robust against noise and reverberation environments.
☆26Updated 5 months ago
gwh22 / LAFMA
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆38Updated last year
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆26Updated last month
kaistmm / fregrad
☆31Updated last year
mubtasimahasan / DM-Codec
Source code for DM-Codec.
☆45Updated 3 weeks ago
francislata / unicats
An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".
☆26Updated last year
3loi / NaturalVoices
☆54Updated 7 months ago
cpdu / vallt
☆36Updated 3 months ago
zy-du / Disentanglement-of-Emotional-Style-and-Speaker-Identity-for-Expressive-Voice-Conversion
This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv…
☆20Updated last year
ductuantruong / speaker_age_estimation_ssl_study
Official implementation of the APSIPA 2022 paper: Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14Updated 2 years ago
caizexin / GenVC
Self-supervised Generative LM-based Voice Conversion
☆37Updated last month
exercise-book-yq / Supercodec
☆47Updated 2 months ago
rishikksh20 / MiniMax-TTS-pytorch
An attempt to replicate the architecture of MiniMaxTTS described in its technical report
☆33Updated last month
walker-hyf / NCSSD
Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)
☆59Updated 7 months ago
cantabile-kwok / vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆76Updated 6 months ago
AlanBaade / SyllableLM
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆55Updated last month