rendchevi / daisy-tts
🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition
★16 · Updated 11 months ago
Alternatives and similar repositories for daisy-tts:
Users who are interested in daisy-tts are comparing it to the repositories listed below.
- The official implementation of EmoSphere++ ★70 · Updated last week
- Codec for paper: LLaSA: Scaling Train-time and Test-time Compute for LLaMA-based Speech Synthesis ★126 · Updated 2 weeks ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ★90 · Updated 3 months ago
- An unofficial PyTorch implementation of VALL-E ★87 · Updated this week
- Speaker change detection using SincNet and an LSTM/Transformer ★46 · Updated 7 months ago
- Zero-Shot Emotion Style Transfer ★41 · Updated 9 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ★74 · Updated last month
- SelfRemaster: SSL Speech Restoration ★88 · Updated last year
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ★114 · Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ★64 · Updated last year
- ★63 · Updated 4 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ★34 · Updated 7 months ago
- ★35 · Updated 4 months ago
- This is the M-AILABS Speech Dataset ★38 · Updated 2 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ★98 · Updated 2 weeks ago
- ★65 · Updated last week
- ★66 · Updated 4 months ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ★118 · Updated 3 weeks ago
- ★69 · Updated last year
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ★139 · Updated 10 months ago
- ★28 · Updated last year
- ★33 · Updated 3 weeks ago
- ★21 · Updated 5 months ago
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ★49 · Updated 3 months ago
- ★19 · Updated last year
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ★51 · Updated last year
- Application of MB-iSTFT-VITS components to vits2_pytorch ★121 · Updated 2 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ★15 · Updated 2 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ★78 · Updated last year
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ★74 · Updated 2 weeks ago