alphacep / vosk-text
☆8Updated 2 years ago
Alternatives and similar repositories for vosk-text:
Users that are interested in vosk-text are comparing it to the libraries listed below
- ☆12Updated 3 months ago
- Forced alignment decoder for Whisper.☆14Updated last year
- This repository contains all the code necessary for running the multilingual DistilWhisper from the Ferraz et al. 2024 IEEE ICASSP paper.☆22Updated last year
- WeNet speech-to-text for React Native☆10Updated 2 years ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.☆27Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Updated 3 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Updated 4 years ago
- A handy dataset of noises for ASR☆21Updated 5 years ago
- ☆11Updated 2 years ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆19Updated 2 years ago
- ☆11Updated 3 years ago
- End-to-end speech synthesis system with FastSpeech2 and HiFi-GAN☆13Updated 2 years ago
- A collection of utilities for handling IPA phones.☆25Updated last year
- 4G GPU and 10 minutes for training☆12Updated last year
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Updated last year
- Prosodic Speech Segmentation with Transformers☆25Updated last year
- ☆17Updated 2 years ago
- ☆17Updated 4 years ago
- ☆13Updated 8 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆28Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆17Updated 6 months ago
- ☆22Updated 3 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆20Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆12Updated 7 months ago
- Conditional Variational Auto-Encoder with Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech☆22Updated 2 years ago
- Just another FastSpeech 2 but cleaner code :)☆26Updated 10 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.☆22Updated last month
- A neural vocoder supporting Ring Attention, Conformer, and NSF.☆18Updated 2 months ago
- Production-ready vocoder using BigVSAN☆11Updated last year
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core☆15Updated last year