talonvoice / wav2train
automatically align transcribed audio and generate a wav2letter training corpus
☆35 · Updated last year
Related projects
Alternatives and complementary repositories for wav2train
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆100 · Updated last year
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks ☆62 · Updated 4 years ago
- 24-hour Automatic Speech Recognition ☆27 · Updated 3 years ago
- ☆20 · Updated 6 years ago
- A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library. ☆14 · Updated 4 years ago
- Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi. ☆26 · Updated 3 months ago
- Python library for handling audio datasets. ☆131 · Updated last year
- ☆74 · Updated 3 years ago
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone … ☆41 · Updated 2 years ago
- Adapting your own Language Model for Kaldi ☆64 · Updated 5 years ago
- Various speech datasets made available to the public ☆99 · Updated last month
- A handy dataset of noises for ASR ☆19 · Updated 5 years ago
- ☆17 · Updated last year
- Grapheme to phoneme model for PyTorch ☆40 · Updated 2 years ago
- FFTNet: a Real-Time Speaker-Dependent Neural Vocoder ☆64 · Updated 6 years ago
- Code for AccentDB. ☆19 · Updated 3 years ago
- Support tools for punctuation and boundary detection for ASR output. ☆57 · Updated last year
- Use your data to create a speech recognition system in Kaldi. Fast. ☆65 · Updated 4 years ago
- ☆32 · Updated 2 months ago
- SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model ☆106 · Updated 3 years ago
- Python library for audio augmentation ☆83 · Updated last year
- Speaker diarization python system based on binary key speaker modelling ☆61 · Updated 2 years ago
- Adapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different language model ☆34 · Updated 4 years ago
- Forced Alignments for Common Voice ☆31 · Updated 4 years ago
- Articulatory features estimation using Listen Attend and Spell architecture. ☆32 · Updated 4 years ago
- An implementation of RNN-Transducer loss in TF-2.0. ☆45 · Updated last year
- Multistream CNN for Robust Acoustic Modeling ☆39 · Updated 3 years ago
- maracas is a library for corrupting audio files with additive and convolutive noise. ☆72 · Updated 7 years ago
- Long audio alignment using Kaldi ☆25 · Updated 3 years ago
- A lightweight library to compute Diarization Error Rate (DER). ☆59 · Updated last year