swagger-coder / visinger_lab
A showcase system written for the visinger SVS system; at its core it is still a music player
☆11 Updated last year
Alternatives and similar repositories for visinger_lab:
Users who are interested in visinger_lab are comparing it to the libraries listed below
- Singing Voice Speech modeling test ☆35 Updated 2 years ago
- ☆39 Updated last year
- UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts ☆16 Updated last month
- TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching ☆17 Updated last month
- A guide to grapheme-to-phoneme conversion and phoneme list for ace singing voice synthesis engine ☆35 Updated 2 weeks ago
- ☆55 Updated 2 years ago
- Streaming Vocos ☆19 Updated 3 weeks ago
- ☆38 Updated 4 months ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆63 Updated 9 months ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 Updated last year
- ☆56 Updated last year
- ☆37 Updated 7 months ago
- Huawei Grad-TTS for Chinese ☆46 Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE" ☆42 Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆47 Updated last year
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat… ☆32 Updated 7 months ago
- The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆31 Updated last year
- ☆41 Updated last year
- Vocoder NSF-HiFiGAN (Moved into deepaudio) ☆50 Updated 2 years ago
- ☆44 Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆52 Updated 2 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆16 Updated last week
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆61 Updated 2 months ago
- ☆65 Updated last year
- ☆22 Updated last year
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆27 Updated 9 months ago
- G2P for English TTS ☆16 Updated 2 years ago
- Unofficial PyTorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake… ☆57 Updated last year
- The source code for the paper XiaoiceSing2 (Interspeech 2023) ☆46 Updated last year
- A pitch detection model trained to be robust against noise and reverberation environments. ☆23 Updated last week