sungwanha / CsWatch
☆11 Updated last year
Alternatives and similar repositories for CsWatch:
Users who are interested in CsWatch are comparing it to the repositories listed below:
- C# project ☆10 Updated last month
- ☆10 Updated last year
- Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E ☆136 Updated 3 months ago
- An official implementation of "UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data" ☆133 Updated last year
- perturbation_autovc ☆18 Updated last year
- The official implementation of EmoSphere-TTS ☆105 Updated last week
- ☆25 Updated 8 months ago
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆184 Updated 9 months ago
- Evaluation Protocol for Large-Scale Zero-Shot TTS Literature ☆70 Updated 4 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆140 Updated 10 months ago
- ☆11 Updated last month
- ☆136 Updated 4 months ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆114 Updated 11 months ago
- Training code for FAcodec presented in NaturalSpeech3 ☆192 Updated 5 months ago
- ☆65 Updated last year
- ☆66 Updated last week
- Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion ☆140 Updated last year
- used to evaluate wavenet vocoder by rmse f0, MCD, rmse ap... ☆15 Updated 5 years ago
- The open source code for SimpleSpeech series ☆122 Updated 3 months ago
- Unofficial PyTorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training ☆133 Updated last year
- ☆24 Updated 10 months ago
- ☆115 Updated 2 years ago
- Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software ☆48 Updated this week
- ☆15 Updated last year
- An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice" ☆96 Updated 2 years ago
- End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions ☆88 Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆229 Updated 2 weeks ago
- ☆17 Updated 10 months ago
- UTokyo-SaruLab MOS Prediction System ☆129 Updated last month
- UT-Sarulab MOS prediction system using SSL models ☆202 Updated 9 months ago