cantabile-kwok / LSCodec-Inference

Inference code for Interspeech 2025 paper, "LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec"

☆24

Alternatives and similar repositories for LSCodec-Inference

Users who are interested in LSCodec-Inference are comparing it to the repositories listed below.

ogunlao / glowtts_stdp
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆18 Updated 2 years ago
meaningTeam / tidy-tunes
Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …
☆21 Updated 2 months ago
mubtasimahasan / DM-Codec
Source code for DM-Codec.
☆49 Updated 3 months ago
jisang93 / VISinger
Unofficial PyTorch implementation of VISinger: Variational Inference with Adversarial Learning for End-to-end Singing Voice Synthesis (IC…
☆15 Updated 2 years ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆38 Updated 7 months ago
xinshengwang / robpitch
A pitch detection model trained to be robust against noise and reverberation environments.
☆27 Updated 7 months ago
lucadellalib / discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
☆24 Updated last year
AI-S2-Lab / FluentEditor
[InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
☆55 Updated 10 months ago
ftshijt / Interspeech2024_DiscreteSpeechChallenge
This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.
☆32 Updated last year
pengzhendong / streaming-vocos
Streaming Vocos
☆29 Updated 3 months ago
yangdongchao / UniAudio2
The open-source code of UniAudio2.0
☆64 Updated last week
DiFlow-TTS / DiFlow-TTS
DiFlow-TTS: Discrete Flow Matching with Factorized Speech Tokens for Low-Latency Zero-Shot Text-to-Speech
☆40 Updated last month
shang0712 / HierTTS
☆45 Updated 2 years ago
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆42 Updated last week
Tikai7 / DiTTO-TTS
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
☆32 Updated 7 months ago
light1726 / SpeechTripleNet
The implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆34 Updated last year
yxlu-0102 / IDEA-TTS
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis
☆27 Updated 5 months ago
justinlovelace / SESD
☆60 Updated 10 months ago
Mddct / transformer-vocos
☆32 Updated last week
gwh22 / LAFMA
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆39 Updated last year
huutuongtu / Lightvoc
LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform
☆18 Updated last year
lavendery / AudioComposer
☆22 Updated this week
JusperLee / Gull-Codec-Training
☆13 Updated 6 months ago
nonverbalspeech38k / nonverspeech38k
The official repository for NonVerbalSpeech-38K.
☆47 Updated 2 weeks ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆49 Updated 4 months ago
zengchang233 / xiaoicesing2
The source code for the paper XiaoiceSing2 (Interspeech 2023)
☆47 Updated last year
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 Updated last year
AI-S2-Lab / GPT-Talker
[ACMMM'2024] Generative Expressive Conversational Speech Synthesis
☆38 Updated 10 months ago
thuhcsi / SnakeGAN
Please visit https://thuhcsi.github.io/SnakeGAN/
☆37 Updated 2 years ago
rishikksh20 / MiniMax-TTS-pytorch
An attempt to replicate the architecture of MiniMaxTTS described in its technical report
☆48 Updated last week