sarulab-speech/UTMOS22

UT-Sarulab MOS prediction system using SSL models

☆309

Alternatives and similar repositories for UTMOS22

Users who are interested in UTMOS22 are comparing it to the repositories listed below.

sarulab-speech / UTMOSv2
UTokyo-SaruLab MOS Prediction System
☆355 · Apr 2, 2026 · Updated 3 months ago
tarepan / SpeechMOS
Easy-to-Use Speech MOS predictors
☆360 · Oct 24, 2023 · Updated 2 years ago
unilight / LDNet
Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"
☆68 · Dec 13, 2021 · Updated 4 years ago
nii-yamagishilab / mos-finetune-ssl
☆112 · Jun 14, 2023 · Updated 3 years ago
Takaaki-Saeki / DiscreteSpeechMetrics
Reference-aware automatic speech evaluation toolkit
☆185 · Dec 5, 2024 · Updated last year
fakerybakery / utmos
A toolkit to calculate speech audio quality. Not affiliated with the original authors
☆74 · Aug 13, 2024 · Updated last year
unilight / sheet
Speech Human Evaluation Estimation Toolkit (SHEET)
☆137 · Mar 31, 2026 · Updated 3 months ago
sp-nitech / diffsptk
A differentiable version of SPTK
☆201 · Jul 14, 2026 · Updated last week
lochenchou / MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
☆380 · Jul 21, 2024 · Updated 2 years ago
gabrielmittag / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆963 · Dec 1, 2024 · Updated last year
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
ZhangXInFD / SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆658 · Jun 9, 2024 · Updated 2 years ago
keonlee9420 / DailyTalk
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
☆260 · Jun 5, 2025 · Updated last year
voidful / Codec-SUPERB
Audio Codec Speech processing Universal PERformance Benchmark
☆308 · Jul 4, 2026 · Updated 2 weeks ago
yangdongchao / AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
☆674 · Dec 27, 2023 · Updated 2 years ago
dhimasryan / MOSA-Net-Cross-Domain
☆63 · May 31, 2024 · Updated 2 years ago
unilight / s3prl-vc
S3PRL-VC: A Voice Conversion Toolkit based on S3PRL
☆101 · Mar 15, 2026 · Updated 4 months ago
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
yangdongchao / SimpleSpeech
The open source code for SimpleSpeech series
☆147 · Oct 8, 2024 · Updated last year
NVIDIA / BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
☆1,227 · Sep 5, 2024 · Updated last year
richardbaihe / a3t
Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
☆89 · Sep 6, 2024 · Updated last year
liusongxiang / ppg-vc
PPG-Based Voice Conversion
☆348 · Jul 22, 2022 · Updated 4 years ago
zhenye234 / xcodec
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
☆308 · Oct 12, 2025 · Updated 9 months ago
facebookresearch / speech-resynthesis
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…
☆416 · Aug 29, 2023 · Updated 2 years ago
AndreevP / wvmos
MOS score prediction by fine-tuned wav2vec2.0 model
☆180 · Oct 20, 2022 · Updated 3 years ago
line / LibriTTS-P
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
☆161 · Jun 13, 2024 · Updated 2 years ago
yl4579 / PitchExtractor
Deep Neural Pitch Extractor for Voice Conversion and TTS Training
☆151 · Aug 22, 2022 · Updated 3 years ago
jishengpeng / TextrolSpeech
[ICASSP 2024] TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models
☆187 · Nov 22, 2024 · Updated last year
line / promptttspp
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
☆86 · Oct 11, 2024 · Updated last year
aliutkus / speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
☆1,050 · Jul 5, 2023 · Updated 3 years ago
mct10 / RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196 · Jul 12, 2024 · Updated 2 years ago
Wendison / VQMIVC
Official implementation of VQMIVC: One-shot (any-to-any) Voice Conversion @ Interspeech 2021 + Online playing demo!
☆361 · Apr 27, 2022 · Updated 4 years ago
maum-ai / phaseaug
ICASSP 2023 Accepted
☆191 · May 6, 2024 · Updated 2 years ago
gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,144 · Aug 7, 2024 · Updated last year
ajd12342 / paraspeechcaps
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
☆162 · Mar 26, 2026 · Updated 3 months ago
yl4579 / PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆269 · Jan 13, 2025 · Updated last year
nii-yamagishilab / ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
☆184 · Mar 6, 2024 · Updated 2 years ago
yangdongchao / UniAudio
The Open Source Code of UniAudio
☆605 · Jul 22, 2024 · Updated 2 years ago