daanzu / speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
☆40Updated 3 years ago
Alternatives and similar repositories for speech-training-recorder:
Users who are interested in speech-training-recorder are comparing it to the libraries listed below.
- Stable word-level timestamps and confidence scores for OpenAI's Whisper outputs.☆25Updated 2 years ago
- A converter from Arpabet to IPA (see https://en.wikipedia.org/wiki/Arpabet)☆17Updated 7 years ago
- On-device voice activity detection (VAD) powered by deep learning☆192Updated 2 weeks ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆107Updated last year
- PyTorch implementation of DeepMind's WaveRNN model☆121Updated 5 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆100Updated last year
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024☆18Updated 2 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆90Updated 3 months ago
- A Python library for speech enhancement☆77Updated 7 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆94Updated 2 weeks ago
- An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)☆117Updated 6 months ago
- Automatically align transcribed audio and generate a wav2letter training corpus☆36Updated last year
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆148Updated last year
- Speaker diarization model☆23Updated last year
- Python wrapper for the rnnoise library☆45Updated 2 years ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition☆16Updated 11 months ago
- A speaker embedding network in PyTorch that is very quick to set up and use for whatever purposes.☆88Updated last year
- An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".☆58Updated 6 months ago
- Scripts to simplify data prepping for Mozilla DeepSpeech.☆14Updated 5 years ago
- ☆34Updated 4 months ago
- Simplified diarization pipeline using pretrained models: audio file to diarized segments in a few lines of code☆145Updated 8 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 7 months ago
- [Last Updated 2021] TTS from Cookie. Messy and experimental!☆43Updated last year
- SailAlign is an open-source software toolkit for robust long speech-text alignment implementing an adaptive, iterative speech recognition…☆97Updated 2 years ago
- Scripts to align a given wave file to its transcription using models trained with Kaldi☆32Updated 5 years ago
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization.☆48Updated 8 months ago
- ☆19Updated last year
- Forced Alignments for Common Voice☆31Updated 4 years ago
- Working online speech recognition based on RNN Transducer. (Trained model available in the releases)☆291Updated 3 years ago