ZaVang / GPT-SoVits
A refactor of the GPT-SOVITS project: rewrites parts of the code, improves the web UI experience, and adds API calls
☆20Updated last month
Alternatives and similar repositories for GPT-SoVits:
Users who are interested in GPT-SoVits are comparing it to the libraries listed below
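- The Bert-VITS2 project is buggy and its tutorials are unfriendly. This project fixes as many Bert-VITS2 bugs as possible and supports one-click training startup. Only 50 utterances from the target speaker are needed to obtain a stable, fast TTS model.☆41Updated 3 months ago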
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.☆83Updated this week
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆92Updated 3 weeks ago
- ☆13Updated 7 months ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2☆51Updated last year
- Huawei Grad-TTS for Chinese☆45Updated last year
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations☆43Updated this week
- ☆65Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆78Updated 9 months ago
- ☆15Updated 2 months ago
- VITS with phoneme-level prosody modeling based on MaskGIT☆79Updated 4 months ago
- A VITS Dataset Generation Tool for Converting Video to Short Audio Based on Damo Academy Video Cutting T…☆54Updated last year
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆57Updated 2 months ago
- CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone☆135Updated 9 months ago
- Training code for MaskGCT-T2S model.☆18Updated last month
- Inference code for Audiodec-Valle-Wenetspeech4TTS☆48Updated 6 months ago
- ChatTTS is a generative speech model for daily dialogue.☆21Updated last week
- A text-conditional diffusion probabilistic model capable of generating high-fidelity audio.☆138Updated 7 months ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer☆75Updated last year
- GPT-SoVITS2☆208Updated 5 months ago
- ☆20Updated this week
- ONNX inference version of Bert-VITS2☆40Updated 8 months ago
- [AAAI 2025] VQTalker: Towards Multilingual Talking Avatars through Facial Motion Tokenization☆34Updated last month
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆71Updated last year
- ☆26Updated this week
- Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).☆95Updated last week
- Chinese and English Bilingual G2P☆20Updated last year
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…☆17Updated 8 months ago