daanzu / speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
☆40Updated 3 years ago
Alternatives and similar repositories for speech-training-recorder:
Users who are interested in speech-training-recorder are comparing it to the repositories listed below
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆107Updated 2 years ago
- Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)☆291Updated 3 years ago
- PyTorch implementation of DeepMind's WaveRNN model☆121Updated 5 years ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model☆107Updated 3 years ago
- Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow☆129Updated 3 years ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition.☆38Updated 3 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆92Updated 4 months ago
- A speaker embedding network in PyTorch that is very quick to set up and use for any purpose.☆88Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆101Updated last year
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆25Updated 2 years ago
- Multilingual Grapheme to Phoneme☆49Updated 8 years ago
- An online speech recognition extension toolkit of Kaldi☆56Updated 3 years ago
- STT Service based on Kaldi ASR☆15Updated 6 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 7 months ago
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…☆48Updated last year
- PyTorch implementation of the paper "MultiSpeech: Multi-Speaker Text to Speech with Transformer"☆22Updated 2 years ago
- This repo is related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH 2024☆18Updated 2 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition☆16Updated 11 months ago
- An unofficial PyTorch implementation of StreamVC (Real-Time Low-Latency Voice Conversion)☆117Updated 6 months ago
- A sequence-to-sequence voice conversion toolkit.☆93Updated 7 months ago
- An unofficial PyTorch implementation of "STREAMVC: REAL-TIME LOW-LATENCY VOICE CONVERSION".☆61Updated 6 months ago
- Forced Alignments for Common Voice☆31Updated 4 years ago
- Generative voice cloning model using TTS synthesis with state-of-the-art Zero-Shot Multi-Speaker functionality. A web API built with the…☆47Updated 2 years ago
- Tools to create your own voice dataset for TTS training☆66Updated 4 years ago
- Predicts the level of noise and reverberation in your audio files☆143Updated 8 months ago
- Automatically align transcribed audio and generate a wav2letter training corpus☆36Updated last year
- Efficient Speech Processing Toolkit for Automatic Speaker Recognition☆17Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning☆197Updated this week