k-farruh / speech-accent-detection
Everyone speaks a language with an accent, and that accent necessarily reflects the speaker's linguistic background. The model identifies an accent from an audio recording. Its output could be used to detect accents and to help students learning English reduce or improve their accents through training.
☆60 Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection:
Users who are interested in speech-accent-detection are comparing it to the repositories listed below:
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆148 Updated 11 months ago
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆50 Updated 9 months ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆120 Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆102 Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆81 Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆158 Updated last week
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆246 Updated 3 months ago
- Putting flows on top of neural transducers for better TTS ☆62 Updated 3 weeks ago
- SelfRemaster: SSL Speech Restoration ☆88 Updated last year
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆130 Updated 3 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆100 Updated 2 months ago
- Repository for Accent Recognition (Hackathon @SLT2022) ☆27 Updated 11 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆112 Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 6 months ago
- Predicts the level of noise and reverberation on your audiofiles ☆148 Updated 11 months ago
- OpenAI Whisper Prompt Examples ☆52 Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- ☆66 Updated 4 months ago
- Python forced alignment ☆87 Updated last year
- AdaSpeech: Adaptive Text to Speech for Custom Voice ☆157 Updated 3 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆159 Updated last year
- Monotonic Alignment Search ☆91 Updated 2 years ago
- ☆21 Updated 8 months ago
- A sequence-to-sequence voice conversion toolkit. ☆97 Updated 9 months ago
- TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion ☆146 Updated last year
- Various speech datasets made available to the public ☆116 Updated 4 months ago
- Collection of pretrained models for the Montreal Forced Aligner ☆143 Updated 9 months ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆87 Updated last year
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation. ☆88 Updated 2 years ago