bshall/dusted

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/bshall/dusted)

bshall / dusted

DUSTED: Spoken-Term Discovery using Discrete Speech Units

☆17

Alternatives and similar repositories for dusted

Users that are interested in dusted are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

mushanshanshan / ESLTTS
View on GitHub
ESLTTS dataset
☆16Feb 6, 2025Updated last year
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
PhonemeHallucinator / Phoneme_Hallucinator
View on GitHub
☆48Aug 16, 2023Updated 2 years ago
iamanigeeit / present
View on GitHub
☆14Aug 19, 2024Updated last year
zhu-han / SpeechLLM
View on GitHub
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆34Sep 25, 2025Updated 9 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
talhanai / kaldi-diar-latte
View on GitHub
steps to perform text-based speaker diarization with kaldi toolkit
☆12Nov 2, 2018Updated 7 years ago
lavendery / UUG
View on GitHub
☆21Sep 14, 2025Updated 10 months ago
LAION-AI / Vocalino-V0.1-Voice-Acting-Pipeline
View on GitHub
Open-weights voice acting pipeline combining zero-shot voice cloning with natural-language direction. Provide a reference voice (or gener…
☆17May 25, 2026Updated last month
Adibian / Persian-MultiSpeaker-Tacotron2
View on GitHub
Implementation of Transfer Learning from Speaker Verification to Multi-speaker Text-To-Speech Synthesis (SV2TTS) in Persian language.
☆13Oct 2, 2025Updated 9 months ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Oct 14, 2024Updated last year
bshall / urhythmic
View on GitHub
Unsupervised Rhythm Modeling for Voice Conversion
☆85Aug 3, 2023Updated 2 years ago
ogunlao / glowtts_stdp
View on GitHub
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19Jun 5, 2023Updated 3 years ago
alobashev / mkl-vc
View on GitHub
[Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport"
☆45Sep 24, 2025Updated 10 months ago
light1726 / SpeechTripleNet
View on GitHub
The implementation of paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody"
☆33Nov 23, 2023Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
m-wiesner / nnet_pytorch
View on GitHub
Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.
☆26Jul 25, 2024Updated last year
prairie-schooner / wav2vec-vc
View on GitHub
☆10Mar 22, 2023Updated 3 years ago
audiolabs / PESQ
View on GitHub
PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band) - including P.862 Corrigendum 2 (03/…
☆23May 27, 2025Updated last year
BUTSpeechFIT / DiaPer
View on GitHub
☆69Feb 8, 2024Updated 2 years ago
SonyResearch / VRVQ
View on GitHub
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54May 1, 2025Updated last year
thu-spmi / CTC-TTS
View on GitHub
Code for CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment, Interspeech 2026.
☆20Jun 9, 2026Updated last month
PecholaL / MAIN-VC
View on GitHub
Lightweight Speech Representation Learning for One-Shot Voice Conversion
☆23Dec 12, 2024Updated last year
leto19 / WhiSQA
View on GitHub
Whisper Speech Quality Assessment (WhiSQA)
☆16Apr 14, 2026Updated 3 months ago
kamperh / linearvc
View on GitHub
Voice conversion with just linear regression.
☆37Sep 25, 2025Updated 9 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
ubisoft / ubisoft-laforge-daft-exprt
View on GitHub
Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis
☆127Apr 8, 2023Updated 3 years ago
adelacvg / DPTTS
View on GitHub
An AR+AR TTS attempt.
☆18Jan 13, 2025Updated last year
CMsmartvoice / Unet-TTS
View on GitHub
One-shot TTS with Improved Unseen Speaker and Style Transfer
☆37Mar 2, 2022Updated 4 years ago
kamperh / speech_dtw
View on GitHub
Dynamic time warping (DTW) functions for specifically speech alignment.
☆30May 6, 2024Updated 2 years ago
NeuroWave-ai / CUCVAE-TTS
View on GitHub
☆25Mar 12, 2022Updated 4 years ago
alumae / streaming-punctuator
View on GitHub
☆17Apr 14, 2023Updated 3 years ago
zerospeech / zerospeech2020
View on GitHub
Python package for the Zero Speech Challenge 2020
☆14Feb 5, 2021Updated 5 years ago
awslabs / speech-representations
View on GitHub
Code for DeCoAR (ICASSP 2020) and BERTphone (Odyssey 2020)
☆104Nov 26, 2022Updated 3 years ago
hs-oh-prml / DurFlexEVC
View on GitHub
☆82Jan 22, 2025Updated last year
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
xiaoxue1117 / speech-mamba-public
View on GitHub
☆15Nov 26, 2024Updated last year
desh2608 / dover-lap
View on GitHub
Python package for combining diarization system outputs.
☆94Oct 12, 2023Updated 2 years ago
bshall / hifigan
View on GitHub
An 16kHz implementation of HiFi-GAN for soft-vc.
☆108Jul 19, 2023Updated 3 years ago
justinlovelace / SESD
View on GitHub
☆61Oct 28, 2024Updated last year
openaudiolab / LLaST
View on GitHub
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆26Aug 11, 2024Updated last year