k-farruh / speech-accent-detection
Everyone speaks a language with an accent, and a particular accent reflects the speaker's linguistic background. The model identifies a speaker's accent from an audio recording. Its output could be used to detect accents and to help students learning English reduce or improve their accents through training.
☆62 Updated 4 years ago
Alternatives and similar repositories for speech-accent-detection
Users who are interested in speech-accent-detection are comparing it to the repositories listed below
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated this week
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 Updated 6 months ago
- Goodness of Pronunciation using Kaldi on Epa-DB database ☆35 Updated last year
- A curated list of awesome voice activity detection ☆69 Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 Updated 2 years ago
- A non-native English corpus for pronunciation scoring task ☆161 Updated last month
- Various speech datasets made available to the public ☆129 Updated 11 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆153 Updated last year
- Putting flows on top of neural transducers for better TTS ☆64 Updated this week
- Collection of pretrained models for the Montreal Forced Aligner ☆178 Updated 2 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 Updated 2 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆52 Updated 3 years ago
- ☆67 Updated 6 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆91 Updated 2 years ago
- Fine-Tune Whisper with Transformers and PEFT ☆58 Updated 2 years ago
- Python forced alignment ☆94 Updated last year
- Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment". ☆194 Updated 2 years ago
- Monotonic Alignment Search ☆100 Updated 6 months ago
- SelfRemaster: SSL Speech Restoration ☆93 Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆151 Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆100 Updated 8 months ago
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆90 Updated 8 months ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆172 Updated 2 years ago
- Phoneme segmentation using pre-trained speech models ☆55 Updated 3 years ago
- Toolbox for easy and qualitative one-shot voice conversion ☆46 Updated 4 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated 2 years ago
- Goodness of Pronunciation (GOP) for oral reading assessment. ☆52 Updated 4 years ago
- OpenAI Whisper Prompt Examples ☆52 Updated 2 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124 Updated 3 years ago