sarulab-speech / jtubespeech
☆221Updated last year
Alternatives and similar repositories for jtubespeech
Users that are interested in jtubespeech are comparing it to the libraries listed below
- ☆87Updated 4 years ago
- context labels and pronunciation data for JSUT corpus☆69Updated 3 years ago
- HTS-style full-context labels for JSUT v1.1☆47Updated 4 years ago
- xvector model on jtubespeech☆45Updated last year
- End-to-End Neural Diarization☆402Updated 3 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆336Updated last year
- UT-Sarulab MOS prediction system using SSL models☆237Updated last year
- ONNX wrapper for ESPnet inference model☆162Updated 7 months ago
- A pure python module for reading and writing kaldi ark files☆259Updated 2 months ago
- An advanced Kaldi wrapper for Python☆38Updated 4 years ago
- ESPnet Model Zoo☆251Updated last year
- ☆32Updated 2 years ago
- Unofficial implementation of miipher☆125Updated last year
- Official implementation of the source-filter HiFiGAN vocoder☆252Updated last year
- Multilingual G2P in 100 languages☆327Updated 2 years ago
- ☆48Updated 4 months ago
- see README☆347Updated 10 months ago
- Speaker embedding (d-vector) trained with GE2E loss☆282Updated last year
- Easy-to-Use Speech MOS predictors☆288Updated last year
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.☆88Updated 2 years ago
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf☆370Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆114Updated 2 years ago
- JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech☆110Updated 2 years ago
- End-to-end MOdeling of ASR (Automatic Speech Recognition)☆33Updated 2 years ago
- Libriheavy: a 50,000-hour ASR corpus with punctuation, casing, and context☆194Updated 8 months ago
- A torch implementation of a recursion which turns out to be useful for RNN-T.☆141Updated last year
- A curated list of awesome papers on contextualizing E2E ASR outputs☆77Updated 2 years ago
- Charsiu: A neural phonetic aligner.☆301Updated 2 years ago
- ☆272Updated 4 years ago
- ☆79Updated last year