45xjh / GPT-SoVITS-ForNoGUI
This repository implements the processes for users who have no GUI
☆18Updated last year
Alternatives and similar repositories for GPT-SoVITS-ForNoGUI
Users that are interested in GPT-SoVITS-ForNoGUI are comparing it to the libraries listed below
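- An efficient Chinese TTS dataset built on a linguistic ontology, comprehensively covering Chinese polyphonic characters, sound changes, and related phenomena. A linguistically grounded and comprehensive Chinese TTS dataset, efficiently covering Chinese polyph…☆52Updated last year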
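- The Bert-VITS2 project has many bugs and unfriendly tutorials. This project fixes as many Bert-VITS2 bugs as possible and supports one-click training. Only 50 utterances from the target speaker are needed to obtain a stable, fast TTS model.☆65Updated 4 months ago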
- A Survey of Spoken Dialogue Models (60 pages)☆313Updated last year
- A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.☆36Updated 9 months ago
- ☆204Updated last year
- ☆176Updated last year
- ☆110Updated 3 months ago
- MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode…☆219Updated last year
- Accelerates CosyVoice2 inference with vLLM☆465Updated 8 months ago
- Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.☆189Updated last year
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆360Updated 7 months ago
- IndexTTS Fine-tuning notebooks☆128Updated 6 months ago
- Unofficial implementation of Megatts2☆287Updated last year
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆216Updated 10 months ago
- TTS application based on modelscope KAN-TTS☆41Updated last year
- A refactoring of the GPT-SOVITS project that rewrites part of the code, improves the webui experience, and adds API calls☆29Updated last year
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆177Updated 8 months ago
- X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech…☆148Updated last week
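- Continues training on the Biaobei (标贝) data and improves the original FastSpeech2 model by introducing prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic☆275Updated 2 years ago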
- Reproduction of the llama-omni training code☆73Updated 11 months ago
- CosyVoice_DPO_NOTES: Supercharge Your Cosyvoice model with Cutting-Edge DPO Fine-Tuning!☆110Updated 5 months ago
- The demo page of InstructTTS☆158Updated last year
- A PyTorch-based VITS-BigVGAN Chinese TTS model with an added prosody prediction model.☆197Updated 3 years ago
- This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.☆48Updated 5 months ago
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice☆494Updated 3 weeks ago
- The repo provides information about KeSpeech dataset.☆167Updated 3 years ago
- [ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training fo…☆1,034Updated last year
- A collection of currently available open-source Chinese dialogue datasets☆197Updated 2 years ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆110Updated last year
- The reproduction code for Qwen-Audio fine-tuning☆53Updated last year