MaxMax2016 / VI-Speaker

Speaker embedding for VI-SVC and VI-SVS, and also for VITS. Use it in place of the speaker ID to implement voice cloning.
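
The idea of replacing a speaker ID with a speaker embedding can be illustrated with a minimal sketch. It is not taken from the VI-Speaker code: the dimensions, the lookup table, the random projection, and the function names are all hypothetical, and a real system would learn the projection and obtain the embedding from a speaker encoder. It only shows why an embedding input can describe a speaker that was never assigned an ID.

```python
import math
import random

HIDDEN = 4      # size of the conditioning vector the synthesizer expects (assumed)
EMBED_DIM = 6   # size of the external speaker embedding (assumed)

random.seed(0)

# (a) ID-based conditioning: one stored vector per training speaker.
id_table = {0: [0.1] * HIDDEN, 1: [-0.2] * HIDDEN}

def condition_from_id(speaker_id):
    """Look up the conditioning vector; raises KeyError for unseen speakers."""
    return id_table[speaker_id]

# (b) Embedding-based conditioning: project any speaker vector to HIDDEN dims.
# The weights are random here; in a trained model they would be learned.
projection = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)]
              for _ in range(HIDDEN)]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def condition_from_embedding(embedding):
    """Map a speaker embedding to a conditioning vector via a linear layer."""
    unit = l2_normalize(embedding)
    return [sum(w * x for w, x in zip(row, unit)) for row in projection]

# Hypothetical embedding of a speaker that has no entry in id_table.
new_speaker = [0.3, -1.2, 0.5, 0.0, 2.0, -0.7]
g = condition_from_embedding(new_speaker)
```

In this sketch `condition_from_id` can only serve the speakers in its table, while `condition_from_embedding` accepts any vector of length `EMBED_DIM`, which is the property that makes cloning an unseen voice possible.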

☆29

Related projects

Alternatives and complementary repositories for VI-Speaker

shang0712 / HierTTS
☆44 Updated last year
choiHkk / VITSinger
Singing Voice Speech modeling test
☆35 Updated 2 years ago
ex3ndr / supervoice-librilight-preprocessed
60k hours of phoneme-aligned audio from audio books
☆18 Updated 3 months ago
choiHkk / CVAEJETS
Conditional Variational Auto-Encoder with Jointly Training FastSpeech2(+Conformer) and HiFi-GAN for End to End Text to Speech
☆46 Updated 2 years ago
adelacvg / diff-vits
☆39 Updated last year
hcy71o / SC-CNN
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems
☆39 Updated last year
hcy71o / MB-iSTFT-VITS-with-AutoVocoder
Incorporating AutoVocoder to MB-iSTFT-VITS
☆46 Updated last year
yuan1615 / AdaVocoder
Adaptive Vocoder for Custom Voice
☆59 Updated 2 years ago
anton-kashkin / hifi_vc
☆25 Updated last year
AlexandaJerry / SingingVoice-MFA-Training
MFA acoustic model training based on Opencpop
☆12 Updated 2 years ago
OlaWod / PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆34 Updated 5 months ago
AI-S2-Lab / GPT-Talker
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆28 Updated 3 weeks ago
innnky / descript-audio-vae
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
☆54 Updated 7 months ago
shivammehta25 / BetterFastSpeech2
Just another FastSpeech 2 but cleaner code :)
☆25 Updated 4 months ago
jisang93 / VISinger
Unofficial PyTorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC…
☆15 Updated last year
seahore / PPG-GradVC
A diffusion-based cross-lingual voice conversion model, as my bachelor's thesis
☆43 Updated last year
hcy71o / SNAC
Unofficial PyTorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speake…
☆56 Updated last year
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆46 Updated last year
p0p4k / vits3_pytorch
☆28 Updated last year
ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18 Updated last year
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆71 Updated last year
francislata / unicats
An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".
☆22 Updated last year
choiHkk / VAEJETS
Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech
☆22 Updated 2 years ago
voidful / vall-e-encodec
☆41 Updated last year
Choddeok / EmoSpherepp
The official implementation of EmoSphere++
☆41 Updated 2 weeks ago
CODEJIN / XiaoiceSing2
☆19 Updated last year