k-farruh / speech-accent-detection
Everyone speaks a language with an accent, and a particular accent reflects the speaker's linguistic background. The model identifies a speaker's accent from an audio recording. Its output could be used to detect accents and to help students learning English reduce or improve their accent through training.
☆62Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection
Users who are interested in speech-accent-detection are comparing it to the libraries listed below.
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…☆48Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆53Updated last month
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆149Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆170Updated last month
- Reproducible experimental protocols for multimedia (audio, video, text) database☆104Updated 5 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆116Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆102Updated 2 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆165Updated 2 years ago
- Goodness of Pronunciation using Kaldi on Epa-DB database☆35Updated last year
- A non-native English corpus for pronunciation scoring task☆143Updated last year
- Putting flows on top of neural transducers for better TTS☆62Updated 3 weeks ago
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code☆202Updated 2 years ago
- Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".☆178Updated 2 years ago
- Python forced alignment☆91Updated last year
- Collection of pretrained models for the Montreal Forced Aligner☆156Updated last month
- Deep Learning model for lexical stress detection in spoken English☆29Updated 5 years ago
- Various speech datasets made available to the public☆123Updated 7 months ago
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g…☆146Updated 3 years ago
- ☆69Updated last month
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆260Updated 6 months ago
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆25Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆88Updated last year
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆82Updated 2 years ago
- Repository for Accent Recognition (Hackathon @SLT2022)☆32Updated last year
- Neural HMMs are all you need (for high-quality attention-free TTS)☆158Updated 3 weeks ago
- Finetuning VITS Efficiently☆33Updated last year
- ☆22Updated 10 months ago
- Advanced data structures for handling temporal segments with attached labels.☆114Updated 5 months ago
- Predicts the level of noise and reverberation on your audiofiles☆153Updated 3 weeks ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆251Updated 11 months ago