microsoft/UniSpeech

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/microsoft/UniSpeech)

microsoft / UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

☆486

Alternatives and similar repositories for UniSpeech

Users that are interested in UniSpeech are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

s3prl / s3prl
View on GitHub
Self-Supervised Speech Pre-training and Representation Learning Toolkit
☆2,557Mar 12, 2026Updated 4 months ago
ZhangXInFD / SpeechTokenizer
View on GitHub
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…
☆658Jun 9, 2024Updated 2 years ago
huawei-noah / Speech-Backbones
View on GitHub
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
☆604Sep 18, 2023Updated 2 years ago
tarepan / SpeechMOS
View on GitHub
Easy-to-Use Speech MOS predictors
☆360Oct 24, 2023Updated 2 years ago
facebookresearch / AudioDec
View on GitHub
An Open-source Streaming High-fidelity Neural Audio Codec
☆511Mar 4, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
modelscope / FunCodec
View on GitHub
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music gener…
☆445Jan 25, 2024Updated 2 years ago
SpeechColab / GigaSpeech
View on GitHub
Large, modern dataset for speech recognition
☆731Feb 26, 2024Updated 2 years ago
lhotse-speech / lhotse
View on GitHub
Tools for handling multimodal data in machine learning projects.
☆1,143Jun 22, 2026Updated last month
X-LANCE / VoiceFlow-TTS
View on GitHub
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
☆376Sep 3, 2024Updated last year
facebookresearch / WavAugment
View on GitHub
A library for speech data augmentation in time-domain
☆689Aug 30, 2021Updated 4 years ago
sarulab-speech / UTMOS22
View on GitHub
UT-Sarulab MOS prediction system using SSL models
☆308Apr 11, 2024Updated 2 years ago
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196Jul 12, 2024Updated 2 years ago
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
liusongxiang / ppg-vc
View on GitHub
PPG-Based Voice Conversion
☆348Jul 22, 2022Updated 4 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
yangdongchao / UniAudio
View on GitHub
The Open Source Code of UniAudio
☆605Jul 22, 2024Updated 2 years ago
microsoft / NeuralSpeech
View on GitHub
☆1,461Feb 11, 2024Updated 2 years ago
wenet-e2e / wespeaker
View on GitHub
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
☆1,367Jul 8, 2026Updated 2 weeks ago
chomeyama / SiFiGAN
View on GitHub
Official implementation of the source-filter HiFiGAN vocoder
☆275Jul 29, 2023Updated 2 years ago
wenet-e2e / wetts
View on GitHub
Production First and Production Ready End-to-End Text-to-Speech Toolkit
☆416Nov 20, 2025Updated 8 months ago
yangdongchao / AcademiCodec
View on GitHub
AcademiCodec: An Open Source Audio Codec Model for Academic Research
☆674Dec 27, 2023Updated 2 years ago
Ereboas / MagiCodec
View on GitHub
A single-layer, streaming codec model providing SOTA audio quality and discrete tokens designed for superior downstream modelability.
☆125Jun 4, 2025Updated last year
ddlBoJack / Speech-Resources
View on GitHub
语音方向实验室/公司/资源/实习等，欢迎推荐或自荐
☆609Nov 13, 2024Updated last year
NVIDIA / BigVGAN
View on GitHub
Official PyTorch implementation of BigVGAN (ICLR 2023)
☆1,227Sep 5, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
descriptinc / descript-audio-codec
View on GitHub
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
☆1,841Jul 16, 2026Updated last week
Snowdar / asv-subtools
View on GitHub
An Open Source Tools for Speaker Recognition
☆638Aug 5, 2024Updated last year
gemelo-ai / vocos
View on GitHub
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,145Aug 7, 2024Updated last year
aliutkus / speechmetrics
View on GitHub
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
☆1,050Jul 5, 2023Updated 3 years ago
ga642381 / speech-trident
View on GitHub
Awesome speech/audio LLMs, representation learning, and codec models
☆1,240Jul 10, 2026Updated 2 weeks ago
k2-fsa / k2
View on GitHub
FSA/FST algorithms, differentiable, with PyTorch compatibility.
☆1,348Jul 11, 2026Updated 2 weeks ago
descriptinc / cargan
View on GitHub
Official repository for the paper "Chunked Autoregressive GAN for Conditional Waveform Synthesis"
☆193Dec 8, 2022Updated 3 years ago
rishikksh20 / Avocodo-pytorch
View on GitHub
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
☆122Jul 14, 2022Updated 4 years ago
hayeong0 / Diff-HierVC
View on GitHub
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr…
☆237Jul 3, 2024Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
google / speaker-id
View on GitHub
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…
☆453Aug 12, 2025Updated 11 months ago
facebookresearch / CPC_audio
View on GitHub
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆374Oct 12, 2021Updated 4 years ago
voidful / Codec-SUPERB
View on GitHub
Audio Codec Speech processing Universal PERformance Benchmark
☆308Jul 4, 2026Updated 3 weeks ago
facebookresearch / libri-light
View on GitHub
dataset for lightly supervised training using the librivox audio book recordings. https://librivox.org/.
☆528Jul 11, 2023Updated 3 years ago
nttcslab-sp / EEND-vector-clustering
View on GitHub
This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-cl…
☆81Oct 18, 2022Updated 3 years ago
auspicious3000 / contentvec
View on GitHub
speech self-supervised representations
☆520Apr 27, 2023Updated 3 years ago
clovaai / voxceleb_trainer
View on GitHub
In defence of metric learning for speaker recognition
☆1,170Apr 22, 2026Updated 3 months ago