ORI-Muchim / One-Click-MB-iSTFT-VITS2
MB-iSTFT-VITS2 (Data Preprocessing + Whisper + Text Preprocessing + Making config.json + Training, Inference) ONE-CLICK
☆12 Updated last year
Alternatives and similar repositories for One-Click-MB-iSTFT-VITS2:
Users who are interested in One-Click-MB-iSTFT-VITS2 are comparing it to the repositories listed below:
- 'Grad-TTS' with Multilingual Cleaners ☆10 Updated 11 months ago
- ☆13 Updated 5 months ago
- Bilingual-TTS (Japanese and Korean) ☆30 Updated last year
- ☆28 Updated last year
- ☆13 Updated 7 months ago
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆12 Updated last month
- ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations ☆18 Updated last month
- Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++ ☆16 Updated 11 months ago
- singing voice conversion based on glow-tts ☆11 Updated last year
- StyleTTS 2 Optimized Training Fork ☆26 Updated last month
- AudioSR-Upsampling (any -> 48kHz) ☆40 Updated last year
- Multi-speaker Speech Synthesis Using VITS (KO, JA, EN, ZH) ☆73 Updated last year
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (such as fish/cosy/chattts). ☆22 Updated this week
- ☆26 Updated last month
- ☆26 Updated last year
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆32 Updated last month
- Unofficial pytorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC… ☆15 Updated last year
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆21 Updated last week
- Japanese Dataset to Multi Language TTS (Only for Japanese Dataset) ☆3 Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆15 Updated 5 months ago
- VI-SVC model is just VITS without MAS and DurationPredictor. ☆10 Updated last year
- A collection of all our phonemizers for dataset construction and inference ☆22 Updated last month
- 4G GPU & 10 Minutes for training ☆12 Updated last year
- Cantonese Text to Speech with VITS implementation ☆29 Updated last year
- ☆35 Updated 11 months ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition. ☆12 Updated 2 weeks ago
- Aligner for text-to-speech ☆14 Updated 8 months ago
- Chinese and English Bilingual G2P ☆20 Updated last year
- VITS (Data Preprocessing + Whisper ASR + Text Preprocessing + Modification config.json + Training, Inference) ☆38 Updated last year
- ☆39 Updated last year