TomJwYu/WenetSpeechSpeakerCluster

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/TomJwYu/WenetSpeechSpeakerCluster)

TomJwYu / WenetSpeechSpeakerCluster

☆55

Alternatives and similar repositories for WenetSpeechSpeakerCluster

Users that are interested in WenetSpeechSpeakerCluster are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

makerjackie / tts-frontend-dataset
View on GitHub
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
☆104Feb 5, 2024Updated 2 years ago
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196Jul 12, 2024Updated 2 years ago
lifeiteng / TTS-TextAnalyzer
View on GitHub
TTS Text Analyzer
☆31Jul 20, 2023Updated 3 years ago
LAION-AI / emotional-speech-annotations
View on GitHub
This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models
☆35Oct 13, 2024Updated last year
lifeiteng / VoiceBox
View on GitHub
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆29Aug 4, 2023Updated 2 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
0913ktg / SC_VALL-E
View on GitHub
Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E
☆136Oct 23, 2024Updated last year
Zain-Jiang / Speech-Editing-Toolkit
View on GitHub
It's a repository for implementations of neural speech editing algorithms.
☆206Jan 9, 2024Updated 2 years ago
hhguo / SoCodec
View on GitHub
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92Dec 20, 2024Updated last year
yangdongchao / ALMTokenizer2
View on GitHub
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆45Sep 5, 2025Updated 10 months ago
lifeiteng / SoundStorm
View on GitHub
☆71Jul 13, 2023Updated 3 years ago
y-ren16 / TiCodec
View on GitHub
☆81Aug 11, 2025Updated 11 months ago
fss1t / CausalStarGANv2-VC
View on GitHub
☆22Apr 4, 2023Updated 3 years ago
yangdongchao / SimpleSpeech
View on GitHub
The open source code for SimpleSpeech series
☆147Oct 8, 2024Updated last year
TTS-Research / PEL-TTS
View on GitHub
☆14Aug 16, 2023Updated 2 years ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
haoxiangsnr / llm-tse
View on GitHub
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
☆43Oct 13, 2023Updated 2 years ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
y-chan / hifi-gan-misrnet
View on GitHub
unofficial pytorch implementation of HiFi-GAN with fast MISR.
☆15Mar 21, 2023Updated 3 years ago
MuyangDu / T5Voice
View on GitHub
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28Nov 7, 2025Updated 8 months ago
WangHelin1997 / LibriLightMix-WHAMR
View on GitHub
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17Nov 7, 2024Updated last year
thuhcsi / SpanPSP
View on GitHub
☆76Apr 26, 2022Updated 4 years ago
SJTMusicTeam / Muskits
View on GitHub
An opensource music processing toolkit
☆321Jun 25, 2023Updated 3 years ago
X-LANCE / UniCATS-CTX-txt2vec
View on GitHub
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
☆64Nov 18, 2024Updated last year
yangdongchao / LLM-Codec
View on GitHub
The open source code for LLM-Codec
☆147Aug 18, 2024Updated last year
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
wenet-e2e / wetts
View on GitHub
Production First and Production Ready End-to-End Text-to-Speech Toolkit
☆416Nov 20, 2025Updated 8 months ago
KdaiP / StableTTS
View on GitHub
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆437Sep 13, 2024Updated last year
Infinity-INF / fast-phasr
View on GitHub
Phonemes and durations labeling based on whisper small
☆11Jul 7, 2024Updated 2 years ago
duyichao / NPDA-KNN-ST
View on GitHub
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11Oct 26, 2022Updated 3 years ago
shang0712 / HierTTS
View on GitHub
☆47Apr 16, 2023Updated 3 years ago
kimsunwiub / BLOOM-Net
View on GitHub
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14Feb 13, 2022Updated 4 years ago
Liangzheng-ZL / BEdit-TTS
View on GitHub
Speech samples and code of BEdit-TTS
☆34Oct 8, 2023Updated 2 years ago
thuhcsi / SECap
View on GitHub
☆181Jul 9, 2024Updated 2 years ago
ASLP-lab / OmniCodec
View on GitHub
OmniCodec: Low Frame Rate Universal Audio Codec with Semantic–Acoustic Disentanglement
☆46Apr 17, 2026Updated 3 months ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
yangdongchao / ALMTokenizer
View on GitHub
The demo page for ALMTokenizer
☆59Apr 14, 2025Updated last year
cantabile-kwok / vec2wav2.0
View on GitHub
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆79Dec 3, 2024Updated last year
Mddct / simple-tts
View on GitHub
（WIP）long form speech generatoins
☆30Apr 2, 2025Updated last year
yangdongchao / SoundStorm
View on GitHub
The reproduced code for Google's SoundStorm
☆275Oct 7, 2023Updated 2 years ago
Daisyqk / Automatic-Prosody-Annotation
View on GitHub
☆112Mar 9, 2026Updated 4 months ago
zjwang21 / mix-phoneme-bert
View on GitHub
An unofficial PyTorch implementation of Mix-Phoneme-Bert
☆40Jul 10, 2023Updated 3 years ago
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago