cyanbx/Prompt-Singer

Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).

☆119

Alternatives and similar repositories for Prompt-Singer

Users who are interested in Prompt-Singer are comparing it to the libraries listed below.

RickyL-2000 / ROSVOT
Robust Singing Voice Transcription and MIDI Extraction
☆123 · Nov 20, 2024 · Updated last year
BiSinger-SVS / BiSinger
Bilingual Singing Voice Synthesis
☆18 · Mar 25, 2024 · Updated 2 years ago
freds0 / free-svc
[ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
☆95 · Jul 23, 2025 · Updated 11 months ago
rishikksh20 / MiniMax-TTS-pytorch
An attempt to replicate the architecture of MiniMaxTTS described in its technical report
☆47 · Sep 2, 2025 · Updated 10 months ago
RickyL-2000 / AlignSTS
Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment
☆68 · Jul 5, 2024 · Updated 2 years ago
bytedance / Make-An-Audio-2
A text-conditional diffusion probabilistic model capable of generating high-fidelity audio.
☆197 · May 29, 2024 · Updated 2 years ago
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37 · Apr 25, 2023 · Updated 3 years ago
zengchang233 / xiaoicesing2
The source code for the paper XiaoiceSing2 (Interspeech 2023)
☆49 · Jan 15, 2024 · Updated 2 years ago
gwx314 / TechSinger
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching
☆100 · Apr 2, 2026 · Updated 3 months ago
KTTRCDL / UMETTS
UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts
☆41 · Jun 12, 2025 · Updated last year
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
Unofficial PyTorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60 · Apr 4, 2024 · Updated 2 years ago
winddori2002 / DEX-TTS
DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability
☆108 · Jan 17, 2025 · Updated last year
streichgeorg / autosing
☆18 · Jan 20, 2025 · Updated last year
lesterphillip / SVCC23_FastSVC
Singing Voice Conversion Challenge 2023 Starter Kit: FastSVC Reimplementation
☆116 · Nov 25, 2023 · Updated 2 years ago
Text-to-Audio / Make-An-Audio-3
Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers
☆121 · May 19, 2025 · Updated last year
zhangyongmao / VISinger2
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer
☆355 · Nov 4, 2024 · Updated last year
qiuqiao / SOFA
SOFA: Singing-Oriented Forced Aligner
☆226 · May 16, 2025 · Updated last year
X-E-Speech / X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112 · Apr 1, 2024 · Updated 2 years ago
hayeong0 / DDDM-VC
Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for V…
☆244 · Jul 31, 2024 · Updated last year
PlayVoice / VI-SVS
Singing Voice Synthesis based on VITS, different from VISinger
☆198 · Nov 13, 2023 · Updated 2 years ago
AI-S2-Lab / FluentEditor
[InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆62 · Oct 23, 2024 · Updated last year
0417keito / PromptTTS2
[WIP] Unofficial Implementation of Microsoft's PromptTTS2
☆53 · Oct 31, 2023 · Updated 2 years ago
Audio-Foundation-Models / ConversationTTS
☆101 · Jan 19, 2026 · Updated 6 months ago
AaronZ345 / TCSinger
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
☆383 · Oct 7, 2025 · Updated 9 months ago
cnaigithub / Auto_Tuning_Zeroshot_TTS_and_VC
Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",…
☆80 · May 29, 2023 · Updated 3 years ago
adelacvg / detail_tts
All generative models in one for a better TTS model
☆74 · Sep 8, 2024 · Updated last year
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
WelkinYang / Learn2Sing2.0
Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher
☆182 · Apr 28, 2023 · Updated 3 years ago
p1an-lin-jung / wv_tts
☆19 · Mar 22, 2024 · Updated 2 years ago
line / LibriTTS-P
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161 · Jun 13, 2024 · Updated 2 years ago
RMSnow / HAT
Official repository for "Structure-Enhanced Pop Music Generation via Harmony-Aware Learning", ACM MM 2022.
☆14 · Mar 22, 2023 · Updated 3 years ago
line / promptttspp
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
☆86 · Oct 11, 2024 · Updated last year
hrnoh24 / stream-vc
An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)
☆129 · Jun 11, 2026 · Updated last month
haidog-yaqub / DiffPitcher
Diffusion-based singing voice pitch correction
☆144 · Sep 20, 2024 · Updated last year
CODEJIN / HiFiSinger
☆111 · Jun 11, 2021 · Updated 5 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
M4Singer / M4Singer
☆227 · Dec 29, 2022 · Updated 3 years ago
AaronZ345 / GTSinger
Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing…
☆377 · Aug 15, 2025 · Updated 11 months ago
zhenye234 / CoMoSpeech
ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
☆214 · Apr 26, 2024 · Updated 2 years ago