sarulab-speech/Sidon

Training code and dataset cleansing with Sidon

☆167

Alternatives and similar repositories for Sidon

Users that are interested in Sidon are comparing it to the libraries listed below.

yukara-ikemiya / Open-Miipher-2
PyTorch implementation of Miipher-2 [2025], a speech restoration model by Google DeepMind
☆70 · Sep 22, 2025 · Updated 10 months ago
tabahi / contexless-phonemes-CUPE
PyTorch model for contextless phoneme prediction from speech audio
☆32 · Oct 30, 2025 · Updated 8 months ago
yukara-ikemiya / wavefit-pytorch
PyTorch implementation of WaveFit [2022, Google], one of the SOTA lightweight, fast speech vocoders.
☆70 · Jul 13, 2026 · Updated last week
YangXusheng-yxs / CodecFormer_5Hz
☆35 · Oct 23, 2025 · Updated 9 months ago
Atotti / miipher-2
Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Pretrained models are also released.
☆32 · Feb 7, 2026 · Updated 5 months ago
ZhikangNiu / Semantic-VAE
[INTERSPEECH 2026 Oral] Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
☆121 · Jun 21, 2026 · Updated last month
IDEA-Emdoor-Lab / DistilCodec
A Neural Audio Codec (NAC) for Universal Audio
☆47 · May 30, 2025 · Updated last year
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
Wataru-Nakata / miipher
Unofficial implementation of miipher
☆137 · Apr 19, 2024 · Updated 2 years ago
malradhi / PACodec
[ICASSP 2026] Official code for "Prosody-Guided Harmonic Attention for Phase-Coherent Neural Vocoding in the Complex Spectrum"
☆27 · Jan 22, 2026 · Updated 6 months ago
ajaybati / miipher2.0
Reimplementation of Miipher
☆30 · Aug 16, 2023 · Updated 2 years ago
Tencent / StableToken
[ICLR 2026] StableToken: A state-of-the-art noise-robust semantic speech tokenizer featuring Voting-LFQ for resilient SpeechLLMs.
☆33 · Feb 27, 2026 · Updated 4 months ago
Blinorot / utmos-pytorch
Unofficial fairseq-free PyTorch implementation of UTMOS (v1, 2022), matching the original system.
☆35 · Jun 6, 2026 · Updated last month
sarulab-speech / UTMOSv2
UTokyo-SaruLab MOS Prediction System
☆356 · Apr 2, 2026 · Updated 3 months ago
HeCheng0625 / Diffusion-Speech-Tokenizer
This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaD…
☆198 · Jan 25, 2026 · Updated 6 months ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 · Aug 5, 2024 · Updated last year
IDEA-Emdoor-Lab / UniTTS
A TTS Trained on Universal Audio.
☆41 · Jun 6, 2025 · Updated last year
gyt1145028706 / XY-Tokenizer
Code for the paper "XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs"
☆97 · Sep 19, 2025 · Updated 10 months ago
Ruiqi-Yan / Awesome-Audio-Editing
A curated list of models, benchmarks, tools and guides for audio editing
☆34 · Jul 7, 2026 · Updated 2 weeks ago
line / WaveTrainerFit
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16 · Feb 6, 2026 · Updated 5 months ago
kyutai-labs / tts_longeval
☆30 · Apr 29, 2026 · Updated 2 months ago
ozspeech / OZSpeech
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45 · Feb 9, 2025 · Updated last year
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
zhenye234 / X-Codec-2.0
Codec for the paper "LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis"
☆360 · Jun 25, 2026 · Updated last month
taresh18 / TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets
☆142 · Aug 10, 2025 · Updated 11 months ago
sarulab-speech / xvector_jtubespeech
xvector model on jtubespeech
☆47 · Nov 5, 2023 · Updated 2 years ago
inworld-ai / tts
Inworld TTS
☆736 · Jul 13, 2026 · Updated last week
liduojia1 / MeanFlowSE
☆43 · Jan 26, 2026 · Updated 6 months ago
Aworselife / DPTBF
☆17 · Sep 12, 2023 · Updated 2 years ago
wavlab-speech / versa
Versatile Evaluation of Speech and Audio
☆424 · Updated this week
AmphionTeam / FlexiCodec
[ICLR 2026] FlexiCodec: A Dynamic Neural Audio Codec for Low Frame Rates
☆50 · Jul 1, 2026 · Updated 3 weeks ago
ajd12342 / paraspeechcaps
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆163 · Mar 26, 2026 · Updated 4 months ago
smallbraineng / smalltts
superfast text to speech in any voice
☆62 · Feb 16, 2026 · Updated 5 months ago
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
k2-fsa / Flow2GAN
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆145 · Mar 8, 2026 · Updated 4 months ago
ictnlp / SLED-TTS
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108 · May 20, 2025 · Updated last year
HuangZikang-TJU / Aug4TSE
☆15 · Sep 16, 2024 · Updated last year
PalabraAI / redimnet2
This repository contains the official implementation and pretrained weights for the paper "ReDimNet2: Scaling Speaker Verification via Ti…
☆65 · Jul 9, 2026 · Updated 2 weeks ago
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year