yl4579/SLMGAN

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yl4579/SLMGAN)

yl4579 / SLMGAN

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs

☆16

Alternatives and similar repositories for SLMGAN

Users that are interested in SLMGAN are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
y-chan / hifi-gan-misrnet
View on GitHub
unofficial pytorch implementation of HiFi-GAN with fast MISR.
☆15Mar 21, 2023Updated 3 years ago
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
leibniz-future-lab / SelfDistill-SER
View on GitHub
☆18Apr 28, 2023Updated 3 years ago
yl4579 / PL-BERT
View on GitHub
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆270Jan 13, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
maum-ai / sane-tts
View on GitHub
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11Jun 30, 2023Updated 3 years ago
yl4579 / StyleTTS
View on GitHub
Official Implementation of StyleTTS
☆466Jan 13, 2025Updated last year
mushanshanshan / ESLTTS
View on GitHub
ESLTTS dataset
☆16Feb 6, 2025Updated last year
PriesiaMioShirakana / Pits-Japanese-Onnx
View on GitHub
PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
☆17Apr 13, 2023Updated 3 years ago
lifeiteng / VoiceBox
View on GitHub
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆29Aug 4, 2023Updated 2 years ago
samsad35 / source-filter-vae
View on GitHub
[SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder
☆46Apr 18, 2023Updated 3 years ago
CODEJIN / DiffSingerKR
View on GitHub
☆25Aug 31, 2024Updated last year
lakahaga / dc-comix-tts
View on GitHub
Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
☆74Aug 21, 2023Updated 2 years ago
LAION-AI / Text-to-speech
View on GitHub
☆61Nov 4, 2023Updated 2 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
MelissaChen15 / control-vc
View on GitHub
This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"
☆132Nov 29, 2023Updated 2 years ago
whzikaros / g2pL
View on GitHub
The implementation of g2pL with a new open dataset.
☆16May 14, 2023Updated 3 years ago
winddori2002 / TriAAN-VC
View on GitHub
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
☆146Jan 15, 2024Updated 2 years ago
MingjieChen / EasyVC
View on GitHub
A toolkit for any-to-any encoder-decoder voice conversion systems
☆83Aug 10, 2023Updated 2 years ago
yl4579 / StyleTTS-VC
View on GitHub
Official Implementation of StyleTTS-VC
☆200Jan 14, 2025Updated last year
oatsu-gh / utau_renderer_with_diff_svc
View on GitHub
Render wav and convert it with [Diff-SVC](https://github.com/prophesier/diff-svc) model
☆10Aug 24, 2025Updated 11 months ago
tenebo / g2pk2
View on GitHub
Updated folk of g2pk
☆13Aug 18, 2023Updated 2 years ago
PlayVoice / BigVGAN
View on GitHub
BigVGAN with Neural Source-Filter
☆58Sep 21, 2023Updated 2 years ago
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
X-LANCE / StoryTTS
View on GitHub
[ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
☆141Apr 27, 2024Updated 2 years ago
MaxMax2016 / max-vc
View on GitHub
singing voice conversion without f0
☆23May 10, 2023Updated 3 years ago
scutcsq / Neural-Transducers-for-Two-Stage-Text-to-Speech-via-Semantic-Token-Prediction
View on GitHub
Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…
☆60Apr 4, 2024Updated 2 years ago
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
youcaiSUN / MuSe-Wild_2020
View on GitHub
☆12Aug 24, 2020Updated 5 years ago
yl4579 / AuxiliaryASR
View on GitHub
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
☆127Jun 16, 2022Updated 4 years ago
iisys-hof / HUI-Audio-Corpus-German
View on GitHub
This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…
☆35Mar 31, 2023Updated 3 years ago
chomeyama / DualCycleGAN
View on GitHub
Official implementation of DualCycleGAN for nonparallel audio super resolution
☆54Nov 1, 2022Updated 3 years ago
RF5 / simple-asgan
View on GitHub
Training code and trained checkpoints for ASGAN.
☆62Dec 27, 2023Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
glory20h / VoiceLDM
View on GitHub
VoiceLDM: Text-to-Speech with Environmental Context
☆194Aug 9, 2024Updated last year
anton-kashkin / hifi_vc
View on GitHub
☆25Jan 24, 2023Updated 3 years ago
fluxions-ai / stftvae
View on GitHub
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43Jul 12, 2026Updated 2 weeks ago
MahtaFetrat / LLM-Powered-G2P
View on GitHub
Code and Resources for "LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study", introducing methods to leverage LLMs for G…
☆19May 21, 2025Updated last year
chomeyama / UnifiedSourceFilterGAN
View on GitHub
☆20Jun 5, 2022Updated 4 years ago
darashi / narabas
View on GitHub
narabas: Japanese phoneme forced alignment tool
☆15Mar 15, 2023Updated 3 years ago
ndkgit339 / spe-dss
View on GitHub
Speech Parameter Estimation Using Differentiable Speech Synthesizer
☆43May 9, 2023Updated 3 years ago