alphacep / vosk-text
☆8 · Updated last year
Related projects:
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated 7 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆27 · Updated last year
- NISQA: Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Updated 2 years ago
- Prosodic Speech Segmentation with Transformers ☆22 · Updated 6 months ago
- Unofficial implementation of the WaveNeXt vocoder ☆28 · Updated 3 weeks ago
- Collection of scripts from mHuBERT-147. ☆21 · Updated 2 months ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks ☆17 · Updated last year
- This repository contains all the code necessary for running the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper. ☆16 · Updated 6 months ago
- Unofficial implementation of ConvNeXt-TTS powered by Lightning and Rye ☆12 · Updated 4 months ago
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆10 · Updated 2 weeks ago
- Simple inference for VITS2 TTS using ONNX Runtime and espeak-ng in C++ ☆11 · Updated 5 months ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆29 · Updated last month
- This repo contains the baseline model recipes and pre-trained model for the GramVanni Hindi ASR challenge ☆14 · Updated 2 years ago
- Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech ☆22 · Updated 2 years ago
- This is a TTS model based on VITS that can control the output speech emotion through natural language and control the speaker through ref… ☆4 · Updated last month
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- A handy dataset of noises for ASR ☆19 · Updated 5 years ago
- Implementation of different noise embeddings for noise-aware training of Kaldi acoustic models. ☆12 · Updated 3 years ago
- Zero-Shot Foreign Accent Conversion without a Native Reference ☆27 · Updated 4 months ago
- Code for the winning solution in the SE&R 2022 Challenge, SER track. ☆13 · Updated last year
- 4G GPU & 10 Minutes for train ☆12 · Updated last year
- The TTSDS benchmark evaluates synthetic speech quality by considering prosody, speaker identity, and intelligibility, comparing these fac… ☆14 · Updated 3 weeks ago
- End-to-end speech synthesis system with FastSpeech2 & HiFi-GAN ☆13 · Updated 2 years ago