rendchevi / daisy-tts
🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
⭐16 · Updated last year
Alternatives and similar repositories for daisy-tts:
Users interested in daisy-tts are comparing it to the libraries listed below.
- The official implementation of EmoSphere++ ⭐80 · Updated 2 weeks ago
- VITS-based zero-shot TTS system varying with diverse style/speaker conditioning methods. ⭐36 · Updated 2 years ago
- Zero-Shot Emotion Style Transfer ⭐43 · Updated 11 months ago
- a Frontier Japanese Speech Generation net ⭐28 · Updated 3 weeks ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ⭐68 · Updated 6 months ago
- An unofficial PyTorch implementation of VALL-E ⭐87 · Updated this week
- ⭐69 · Updated last year
- ⭐68 · Updated 7 months ago
- PyTorch implementation for MultiSpeech: Multi-Speaker Text to Speech with Transformer paper ⭐22 · Updated 2 years ago
- ⭐36 · Updated 6 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ⭐63 · Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ⭐85 · Updated last year
- ⭐29 · Updated last year
- Unsupervised Rhythm Modeling for Voice Conversion ⭐80 · Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ⭐17 · Updated 4 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ⭐21 · Updated 2 weeks ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ⭐95 · Updated 5 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ⭐82 · Updated 2 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ⭐150 · Updated last year
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ⭐34 · Updated 9 months ago
- All generative model in one for better TTS model ⭐66 · Updated 6 months ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ⭐120 · Updated 2 years ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ⭐51 · Updated 5 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ⭐101 · Updated 2 months ago
- Official Implementation of StyleTTS-VC ⭐177 · Updated 2 months ago
- ⭐41 · Updated this week
- End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions ⭐92 · Updated last year
- Implementation of Emo-StarGAN ⭐45 · Updated last year
- ⭐64 · Updated 6 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ⭐61 · Updated 3 weeks ago