cyanbx / Prompt-Singer
Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).
☆96 · Updated this week
Alternatives and similar repositories for Prompt-Singer:
Users who are interested in Prompt-Singer compare it to the repositories listed below.
- Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment ☆66 · Updated 6 months ago
- Robust Singing Voice Transcription and MIDI Extraction ☆67 · Updated 2 months ago
- ☆66 · Updated 2 months ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers ☆91 · Updated 3 months ago
- Official code for the CVPR'24 paper Diff-BGM ☆54 · Updated 3 months ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆63 · Updated 9 months ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆98 · Updated 2 weeks ago
- [WIP] Unofficial Implementation of Microsoft's PromptTTS2 ☆51 · Updated last year
- CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone ☆135 · Updated 10 months ago
- Unofficial download repository for MusicCaps ☆45 · Updated last year
- Vocoder NSF-HiFiGAN (Moved into deepaudio) ☆50 · Updated 2 years ago
- ☆72 · Updated 2 years ago
- The latent diffusion model for text-to-music generation. ☆165 · Updated last year
- ☆60 · Updated last year
- Diffusion Singing Voice Conversion based on Grad-TTS from Huawei ☆139 · Updated last year
- ☆37 · Updated 7 months ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer ☆75 · Updated last year
- E2E TTS using Conditional Flow Matching (Experimental*) ☆69 · Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆74 · Updated last month
- Unofficial implementation of JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (https://arxiv.org/abs/2308.… ☆52 · Updated last year
- Music generation ☆24 · Updated 8 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling ☆61 · Updated 2 months ago
- All generative models in one for a better TTS model ☆66 · Updated 4 months ago
- ACM MM 2024 FlashSpeech: Efficient Zero-Shot Speech Synthesis ☆120 · Updated 4 months ago
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies". ☆80 · Updated last year
- The source code for the paper XiaoiceSing2 (interspeech2023) ☆46 · Updated last year
- ☆69 · Updated 3 months ago
- dog-can-sing-song ☆18 · Updated 2 months ago
- [InterSpeech 24] FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter ☆86 · Updated 6 months ago
- Audio Prompt Adapter: Unleashing music editing abilities for text-to-music with lightweight finetuning [ISMIR 2024] ☆46 · Updated 3 months ago