daanzu / speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
☆40 Updated 3 years ago
Alternatives and similar repositories for speech-training-recorder:
Users who are interested in speech-training-recorder are comparing it to the repositories listed below.
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆112 Updated 2 years ago
- Forced Alignments for Common Voice ☆31 Updated 4 years ago
- 🐸TTS recipes for different datasets ☆87 Updated 2 years ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma… ☆20 Updated 3 years ago
- ☆39 Updated last year
- scripts to align a given wave to its transcription using trained models by Kaldi ☆32 Updated 5 years ago
- automatically align transcribed audio and generate a wav2letter training corpus ☆36 Updated 2 years ago
- ☆79 Updated 11 months ago
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level. ☆25 Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆102 Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 7 months ago
- Speaker diarization model ☆27 Updated 2 years ago
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆49 Updated 2 years ago
- Tools to create your own voice dataset for TTS training ☆66 Updated 4 years ago
- ☆63 Updated 3 weeks ago
- ☆35 Updated last week
- SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition… ☆98 Updated 3 years ago
- ☆61 Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year
- Repository for the INTERSPEECH 2024 paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" ☆20 Updated 5 months ago
- 🐸STT integration examples ☆126 Updated 2 years ago
- ☆54 Updated last year
- Apply machine learning model DTLN for noise suppression and acoustic echo cancellation on Raspberry Pi ☆66 Updated 3 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆63 Updated last month
- A high-quality, varied ~30hr voice dataset suitable for training a TTS model ☆59 Updated 2 years ago
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆88 Updated last month
- Speaker change detection using SincNet and an LSTM/Transformer ☆50 Updated 10 months ago
- ☆28 Updated last year
- A curated list of awesome voice activity detection ☆50 Updated 5 months ago
- python wrapper for rnnoise library ☆48 Updated 2 years ago