daanzu / speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
☆41 Updated 3 years ago
Alternatives and similar repositories for speech-training-recorder
Users who are interested in speech-training-recorder are comparing it to the repositories listed below:
- PyTorch implementation of DeepMind's WaveRNN model ☆121 Updated 5 years ago
- SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition… ☆98 Updated 3 years ago
- ☆40 Updated last year
- OpenAI Whisper Prompt Examples ☆52 Updated last year
- Python wrapper for the rnnoise library ☆48 Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆99 Updated 8 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma… ☆20 Updated 3 years ago
- A Python library for speech enhancement ☆80 Updated last year
- An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion) ☆124 Updated 10 months ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆26 Updated last month
- Automatically align transcribed audio and generate a wav2letter training corpus ☆36 Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆52 Updated last month
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆23 Updated 3 months ago
- ☆80 Updated last year
- A speaker embedding network in PyTorch that is very quick to set up and use for whatever purposes. ☆88 Updated 2 months ago
- On-device voice activity detection (VAD) powered by deep learning ☆218 Updated last week
- ☆17 Updated 4 years ago
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project. ☆70 Updated 2 years ago
- A high-quality, varied ~30hr voice dataset suitable for training a TTS model ☆60 Updated 2 years ago
- STT Service based on Kaldi ASR ☆15 Updated 6 years ago
- StyleTTS2 + Vocos as a Decoder ☆12 Updated 3 months ago
- Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021 ☆39 Updated 3 years ago
- Deep Convolution Text to Speech ☆35 Updated 7 years ago
- PyTorch implementation of the paper "MultiSpeech: Multi-Speaker Text to Speech with Transformer" ☆21 Updated 3 years ago
- Provides Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆62 Updated 3 weeks ago
- ☆54 Updated last year
- A Python-based modular toolbox for building Deep Neural Network models (using PyTorch) for statistical parametric speech synthesis ☆23 Updated 3 years ago
- VCTK multi-speaker tacotron for ICASSP 2020 ☆266 Updated 3 years ago
- Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no Python package dependencies ☆14 Updated 7 months ago