k-farruh / speech-accent-detection
People speak a language with an accent, and a particular accent reflects the speaker's linguistic background. The model identifies the accent from an audio recording. Its output could be used to detect accents and to help students learning English reduce or improve their accents through training.
☆57 Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection:
Users who are interested in speech-accent-detection are comparing it to the repositories listed below.
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 Updated 7 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆79 Updated last year
- ☆34 Updated 3 years ago
- SelfRemaster: SSL Speech Restoration ☆88 Updated last year
- AdaSpeech: Adaptive Text to Speech for Custom Voice ☆156 Updated 3 years ago
- Repository for Accent Recognition (Hackathon @SLT2022) ☆25 Updated 9 months ago
- Python forced alignment ☆84 Updated 10 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆50 Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- Convert English text from written expressions into spoken forms ☆24 Updated 2 years ago
- ☆66 Updated 2 months ago
- English conversation corpus for conversational TTS. ☆20 Updated last year
- Various speech datasets made available to the public ☆113 Updated 2 months ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆79 Updated 10 months ago
- Monotonic Alignment Search ☆89 Updated 2 years ago
- Use VITS and Opencpop to develop singing voice synthesis; Different from VISinger. ☆35 Updated last year
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions ☆233 Updated last month
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆78 Updated last month
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆118 Updated 2 years ago
- Fine-Tune Whisper with Transformers and PEFT ☆49 Updated last year
- 56 language, 1 model Multilingual ASR ☆24 Updated 3 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆96 Updated this week
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆101 Updated last year
- ☆67 Updated 3 weeks ago
- ☆38 Updated this week
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021 ☆67 Updated 3 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆49 Updated last year
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆120 Updated last month
- Putting flows on top of neural transducers for better TTS ☆61 Updated last week