0417keito / UTAUTAI
UTAUTAI (Unrestricted Tune Automated Technology Artificial Intelligence)
☆12Updated last year
Alternatives and similar repositories for UTAUTAI
Users who are interested in UTAUTAI are comparing it to the libraries listed below
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆17Updated 11 months ago
- ☆19Updated last year
- My vocoder experiments☆31Updated last month
- ☆28Updated last year
- speaker-disentangled speech linguistic content quantizer☆22Updated 6 months ago
- A TTS Trained on Universal Audio.☆39Updated 3 months ago
- Non Parallel Voice Conversion based on VITS☆24Updated 2 years ago
- ☆16Updated last year
- Codebase and project page for EDMSound☆34Updated last year
- Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.☆30Updated 3 years ago
- Zero-Shot Emotion Style Transfer☆49Updated 4 months ago
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions☆84Updated 11 months ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion☆76Updated last month
- ☆22Updated 11 months ago
- ☆11Updated 10 months ago
- AudioSR-Upsampling (any -> 48kHz)☆41Updated last year
- Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model☆33Updated 4 months ago
- 60k hours of phoneme-aligned audio from audio books☆19Updated last year
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger.☆34Updated 2 years ago
- Just another FastSpeech 2 but cleaner code :)☆27Updated last year
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2☆52Updated last year
- ☆43Updated last year
- A Neural Audio Codec (NAC) for Universal Audio☆42Updated 3 months ago
- BigVGAN with Neural Source-Filter☆55Updated 2 years ago
- A High-resolution implementation of HiFi-GAN Vocoder for Voice Conversion.☆31Updated 2 years ago
- ☆15Updated 2 months ago
- ☆37Updated last year
- a lightweight voice conversion☆84Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆54Updated last year
- GPT-style network for phonemization with durations of text☆67Updated last year