☆229Nov 13, 2023Updated 2 years ago
Alternatives and similar repositories for jtubespeech
Users who are interested in jtubespeech are comparing it to the libraries listed below.
- ☆89Mar 5, 2021Updated 5 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆74Oct 9, 2020Updated 5 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆345May 15, 2024Updated last year
- HTS-style full-context labels for JSUT v1.1☆51Apr 16, 2021Updated 4 years ago
- Baseline model recipes and a pre-trained model for the GramVanni Hindi ASR challenge☆15Mar 26, 2022Updated 3 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 3 years ago
- Properly handle position-dependent phones in a subword lexicon FST☆31Oct 26, 2020Updated 5 years ago
- The People’s Speech Dataset☆113Jan 11, 2024Updated 2 years ago
- UT-Sarulab MOS prediction system using SSL models☆296Apr 11, 2024Updated last year
- Python wrapper for Kaldi's arpa2fst☆38Aug 27, 2025Updated 6 months ago
- Context labels and pronunciation data for the JSUT corpus☆77Sep 2, 2021Updated 4 years ago
- xvector model on jtubespeech☆47Nov 5, 2023Updated 2 years ago
- ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)☆267Mar 7, 2023Updated 2 years ago
- A handy dataset of noises for ASR☆22May 29, 2019Updated 6 years ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva☆91Feb 18, 2025Updated last year
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆174Jun 9, 2023Updated 2 years ago
- ☆36Sep 20, 2022Updated 3 years ago
- A Japanese accent dictionary generator☆123Mar 21, 2024Updated last year
- A library for speech data augmentation in time-domain☆683Aug 30, 2021Updated 4 years ago
- This repository contains prompts and best practices for annotating audio clips in very high detail using audio-language models☆35Oct 13, 2024Updated last year
- Large, modern dataset for speech recognition☆721Feb 26, 2024Updated 2 years ago
- Multilingual G2P in 100 languages☆375May 26, 2023Updated 2 years ago
- Interspeech 2019 tutorial materials☆49Sep 26, 2019Updated 6 years ago
- TensorFlow and Kaldi implementation of our paper "VAE-based regularization for deep speaker embedding"☆11Mar 24, 2023Updated 2 years ago
- Steps to perform text-based speaker diarization with the Kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus☆21Jun 12, 2024Updated last year