swagger-coder / visinger_lab
A demo system written for the visinger SVS system; at its core it is still a music player.
☆11 Updated 2 years ago
Alternatives and similar repositories for visinger_lab
Users that are interested in visinger_lab are comparing it to the libraries listed below
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis ☆44 Updated last month
- ☆55 Updated 3 years ago
- This repo is text to speech with learnable audio encoder without alignment with transcript reference ☆39 Updated last month
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆52 Updated last year
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆68 Updated last year
- ☆23 Updated last year
- ☆66 Updated 2 years ago
- MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows ☆98 Updated last month
- Streaming Vocos ☆29 Updated 4 months ago
- Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models ☆41 Updated 7 months ago
- The source code for the paper XiaoiceSing2 (interspeech2023) ☆47 Updated last year
- A unified tokenizer that is capable of both extracting semantic information and enabling high-fidelity audio reconstruction. ☆105 Updated last month
- Official implementation of the paper titled "Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Mu… ☆27 Updated last year
- ☆14 Updated 11 months ago
- Unofficial Pytorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake… ☆57 Updated 2 years ago
- Music generation ☆24 Updated last year
- Inference code for Audiodec-Valle-Wenetspeech4TTS ☆50 Updated last year
- E2E TTS using Conditional Flow Matching (Experimental*) ☆71 Updated last year
- ☆23 Updated 4 months ago
- STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation ☆57 Updated 3 months ago
- Bilingual Singing Voice Synthesis ☆18 Updated last year
- ☆54 Updated 3 months ago
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆31 Updated 2 months ago
- DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors ☆33 Updated 8 months ago
- g2p for english tts ☆19 Updated 2 years ago
- ☆24 Updated 2 years ago
- faster inference ☆28 Updated 9 months ago
- ☆19 Updated 2 years ago
- Vocoder NSF-HiFiGAN (Moved into deepaudio) ☆55 Updated 2 years ago
- The source code for the paper CrossSinger (asru2023) ☆18 Updated 2 years ago