k-farruh / speech-accent-detection
People speak a language with an accent, and a particular accent reflects a person's linguistic background. The model identifies a speaker's accent from an audio recording. Its results could be used to detect accents and to help English learners reduce or improve their accent through training.
☆62Updated 4 years ago
Alternatives and similar repositories for speech-accent-detection
Users that are interested in speech-accent-detection are comparing it to the libraries listed below
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆119Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆108Updated 3 weeks ago
- Goodness of Pronunciation using Kaldi on Epa-DB database☆35Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp…☆48Updated 2 years ago
- A non-native English corpus for pronunciation scoring task☆161Updated 2 months ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆265Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆151Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆92Updated 2 years ago
- A curated list of awesome voice activity detection☆70Updated last year
- Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".☆194Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆107Updated 2 years ago
- Speaker change detection using SincNet and an LSTM/Transformer☆56Updated 7 months ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆264Updated 11 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆153Updated last year
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆172Updated 2 years ago
- Collection of pretrained models for the Montreal Forced Aligner☆181Updated 2 months ago
- PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T…☆194Updated 3 years ago
- This project is about performing Speaker diarization for Hindi Language.☆58Updated 4 years ago
- [Interspeech22]Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Ass…☆34Updated last year
- Monotonic Alignment Search☆100Updated 6 months ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆67Updated 3 years ago
- ☆67Updated 6 months ago
- A speaker embedding network in PyTorch that is very quick to set up and use for whatever purposes.☆90Updated 9 months ago
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code☆203Updated 3 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)☆124Updated 3 years ago
- ☆69Updated last year
- Putting flows on top of neural transducers for better TTS☆64Updated 3 weeks ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,…☆314Updated 4 years ago
- Predicts the level of noise and reverberation on your audiofiles☆173Updated 6 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…☆100Updated 11 months ago