kosuke-kitahara / xlsr-wav2vec2-phoneme-recognition
☆28Updated 3 years ago
Related projects
Alternatives and complementary repositories for xlsr-wav2vec2-phoneme-recognition
- [Interspeech22] Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Ass…☆22Updated 10 months ago
- Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y…☆25Updated 5 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆47Updated last year
- Phoneme segmentation using pre-trained speech models☆54Updated 2 years ago
- ☆63Updated last month
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)☆79Updated 3 years ago
- End-to-End Mispronunciation Detection via wav2vec2.0☆42Updated 2 years ago
- A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augment Techniques☆57Updated 3 years ago
- multilingual speech aligner☆72Updated last year
- ☆33Updated 3 years ago
- Estimating the age, height, and gender of a speaker from their speech signal. https://arxiv.org/pdf/2110.13653.pdf☆64Updated 3 years ago
- Transformer implementation specialized in speech recognition tasks using PyTorch.☆64Updated 2 years ago
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021☆57Updated 2 years ago
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…☆47Updated last year
- Goodness of Pronunciation using Kaldi on Epa-DB database☆33Updated 10 months ago
- Rescoring methods for end-to-end Automatic Speech Recognition☆27Updated 4 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆44Updated 4 months ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆71Updated 3 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆58Updated 2 years ago
- PyTorch implementation of the paper "MultiSpeech: Multi-Speaker Text to Speech with Transformer"☆19Updated 2 years ago
- An unofficial implementation of https://arxiv.org/abs/2005.05106☆46Updated 3 years ago
- These are Jupyter Notebooks to help guide people to learn how to use Praat-Parselmouth☆37Updated 3 years ago
- ☆40Updated 2 years ago
- ☆28Updated 2 years ago
- An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".☆33Updated 3 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN)☆57Updated 4 years ago
- ☆25Updated 2 years ago
- Toolbox for easy and qualitative one-shot voice conversion☆45Updated 2 years ago
- QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]☆26Updated 3 years ago
- A new metric for evaluating end-to-end speech recognition and disfluency removal systems☆19Updated 3 years ago