seanghay / vits.cpp

VITS Inference using ONNX Runtime on C++

☆13

Alternatives and similar repositories for vits.cpp

Users that are interested in vits.cpp are comparing it to the libraries listed below

frankyoujian / Edge-Punct-Casing
☆29 Updated last year
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆62 Updated 5 months ago
lovemefan / Silero-vad-pytorch
Silero VAD PyTorch implementation
☆34 Updated last year
liuhuang31 / Megatts2_HierSpeechpp
Megatts2 using HierSpeechpp's vocoder
☆18 Updated last year
jiay7 / wenet_onlinedecode
WeNet online decoding demo
☆31 Updated 4 years ago
k2-fsa / colab
Colab notebooks for Next-gen Kaldi
☆29 Updated 3 months ago
pengzhendong / pyannote-onnx
ONNX Inference of Pyannote Segmentation
☆97 Updated last year
NTIA / alignnet
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18 Updated 6 months ago
wetdog / wavenext_pytorch
Unofficial implementation of the WaveNeXt vocoder
☆56 Updated last year
pirxus / personalVAD
An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.
☆79 Updated 3 years ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆34 Updated 9 months ago
hs-oh-prml / DurFlexEVC
☆82 Updated last year
k2-fsa / Flow2GAN
Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-Step High-Fidelity Audio Generation
☆134 Updated 2 weeks ago
sarulab-speech / Sidon
Training code and dataset cleansing with Sidon
☆75 Updated 3 weeks ago
Kevin-naticl / LLaSE-G1
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
☆94 Updated 10 months ago
frank613 / CTC-based-GOP
This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024
☆34 Updated 3 weeks ago
pengzhendong / pysilero
Python Wrapper of Silero VAD
☆64 Updated 9 months ago
tabahi / contexless-phonemes-CUPE
PyTorch model for contextless phoneme prediction from speech audio
☆30 Updated 3 months ago
desh2608 / diarizer
Clustering-based methods for overlapping diarization
☆82 Updated 2 years ago
liyunlongaaa / NSD-MS2S
CHIME-7/8 diarization champion system: neural speaker diarization using memory-aware multi-speaker embedding with sequence-to-sequence ar…
☆83 Updated 7 months ago
OlaWod / PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆36 Updated last year
seastar105 / pflow-encodec
Implementation of TTS model based on NVIDIA P-Flow TTS Paper
☆77 Updated last year
cantabile-kwok / vec2wav2.0
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
☆78 Updated last year
Stylish-TTS / stylish-tts
High quality text-to-speech based on StyleTTS 2.
☆71 Updated last month
lars76 / fastspeech2-clean
Clean and modernized implementation of FastSpeech2/LightSpeech using IPA
☆17 Updated last year
adelacvg / detail_tts
All generative models in one for a better TTS model
☆74 Updated last year
ncsoft / PhonMatchNet
Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023)
☆59 Updated last year
k2-fsa / text_search
Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup
☆79 Updated 7 months ago
uthree / tinyvc
Lightweight voice conversion
☆86 Updated last year
Ephrem-ETH / E2E-KWS
End-to-End Keyword Spotting (E2E-KWS) using a character-level LSTM
☆43 Updated 3 years ago