k-farruh / speech-accent-detection
Everyone speaks a language with an accent, and a particular accent necessarily reflects the speaker's linguistic background. The model identifies an accent from an audio recording. Its results could be used to determine accents and to help students learning English reduce or improve their accents through training.
☆61 Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection
Users who are interested in speech-accent-detection are comparing it to the repositories listed below
- Speaker change detection using SincNet and an LSTM/Transformer ☆52 Updated last month
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- ☆40 Updated last year
- A non-native English corpus for pronunciation scoring task ☆143 Updated 11 months ago
- Putting flows on top of neural transducers for better TTS ☆62 Updated this week
- ☆22 Updated 10 months ago
- A curated list of awesome voice activity detection ☆57 Updated 7 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆102 Updated 4 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆94 Updated 5 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆84 Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆65 Updated 4 years ago
- SelfRemaster: SSL Speech Restoration ☆89 Updated last year
- Add n-gram and large language model (LLM) support to Whisper models. ☆26 Updated last month
- ☆67 Updated 2 weeks ago
- Repository for Accent Recognition (Hackathon @SLT2022) ☆32 Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆92 Updated last year
- Spot the conversation: speaker diarisation in the wild ☆141 Updated 2 years ago
- Phoneme segmentation using pre-trained speech models ☆55 Updated 2 years ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆258 Updated 5 months ago
- Various speech datasets made available to the public ☆122 Updated 6 months ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated 2 years ago
- ☆61 Updated last year
- Toolbox for easy and qualitative one-shot voice conversion ☆45 Updated 3 years ago
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆88 Updated 2 months ago
- ☆54 Updated last year
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆138 Updated 3 months ago
- ☆66 Updated 9 months ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition. ☆43 Updated 3 years ago
- Monotonic Alignment Search ☆94 Updated 2 weeks ago
- Voice Activity Projection Models: Self-supervised learning of Turn-taking Events ☆69 Updated last year