Keith-Hon/vits-cantonese

Keith-Hon / vits-cantonese

Cantonese Text to Speech with VITS implementation

☆37

Alternatives and similar repositories for vits-cantonese

Users who are interested in vits-cantonese are comparing it to the libraries listed below.

zhaohb / MeloTTS-OV
Using OpenVINO to speed up MeloTTS inference
☆15 · Nov 1, 2024 · Updated last year
mirfan899 / CTTS
Cantonese TTS frontend
☆16 · Oct 14, 2019 · Updated 6 years ago
lifeiteng / Aligner-SUPERB
Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark
☆38 · May 7, 2025 · Updated last year
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆16 · Dec 3, 2024 · Updated last year
iamanigeeit / present
☆14 · Aug 19, 2024 · Updated last year
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆26 · Jun 6, 2026 · Updated last month
Hannes1 / react-native-wenet
Wenet speech to text for react native
☆10 · Nov 1, 2022 · Updated 3 years ago
giovana-morais / steme
[ICASSP 2023] Tempo vs. Pitch: understanding self-supervised tempo estimation
☆13 · Aug 2, 2023 · Updated 2 years ago
Mddct / simple-tts
(WIP) long-form speech generation
☆31 · Apr 2, 2025 · Updated last year
Mddct / cosyvoice2-flow-optimized
faster inference
☆28 · Jan 20, 2025 · Updated last year
pkufool / cppinyin
Converting Chinese sentences into pinyin sequences, implemented in C++, very fast and easy to deploy.
☆23 · Jan 5, 2026 · Updated 6 months ago
Mddct / transformer-vocos
☆36 · Sep 6, 2025 · Updated 10 months ago
projectlucas / efficient_whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆19 · Dec 1, 2022 · Updated 3 years ago
dengcunqin / noise-reduction
noise reduction
☆17 · Jul 3, 2024 · Updated 2 years ago
PINTO0309 / onnx-aec
A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX.
☆13 · Oct 22, 2024 · Updated last year
lifeiteng / NotebookTTS
Text-To-Speech for NotebookLM
☆39 · Jul 20, 2025 · Updated 11 months ago
athena-team / DiDiSpeech
☆45 · Oct 24, 2020 · Updated 5 years ago
bagustris / s3prl-ser
S3PRL for Speech Emotion Recognition (see s3prl > downstream)
☆15 · Feb 28, 2026 · Updated 4 months ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆36 · Jun 15, 2026 · Updated 3 weeks ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Oct 14, 2024 · Updated last year
cantabile-kwok / vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆79 · Dec 3, 2024 · Updated last year
frankyoujian / Edge-Punct-Casing
☆33 · Feb 4, 2025 · Updated last year
CjangCjengh / chinese-dialect-lexicons
Grapheme-to-Phoneme lexicons for Chinese dialects
☆70 · Nov 20, 2022 · Updated 3 years ago
schufo / tisms
This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech"
☆16 · Apr 8, 2024 · Updated 2 years ago
pengzhendong / g2p-mix
Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.
☆115 · Dec 2, 2025 · Updated 7 months ago
yukara-ikemiya / Open-Miipher-2
PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind
☆69 · Sep 22, 2025 · Updated 9 months ago
MorenoLaQuatra / vad
Simple voice activity detection (VAD) algorithm in Python
☆15 · Aug 10, 2023 · Updated 2 years ago
diyclassics / perseus-lookup
Lookup Latin/Greek vocabulary from the command line using Python/Beautiful Soup.
☆15 · Dec 30, 2025 · Updated 6 months ago
utter-project / fairseq
This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147.
☆21 · Nov 19, 2024 · Updated last year
bookbot-hive / k2-indonesian-asr
Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).
☆15 · Jun 30, 2023 · Updated 3 years ago
primepake / dac_vae
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆37 · Aug 30, 2025 · Updated 10 months ago
kyegomez / USM
Implementation of Google's USM speech model in PyTorch
☆36 · Jun 22, 2026 · Updated 2 weeks ago
sungnyun / ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41 · Aug 29, 2024 · Updated last year
RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80 · Sep 27, 2023 · Updated 2 years ago
twardoch / audiostretchy
AudioStretchy is a Python wrapper around the `audio-stretch` C library, which performs fast, high-quality time-stretching of WAV/MP3 file…
☆61 · Sep 24, 2025 · Updated 9 months ago
EMRAI / emrai-synthetic-diarization-corpus
☆21 · Sep 24, 2018 · Updated 7 years ago
linlsyf / CantoneseDict
Open Cantonese dictionary
☆10 · Oct 13, 2023 · Updated 2 years ago