k-farruh / speech-accent-detection
People speak a language with an accent, and a particular accent reflects the speaker's linguistic background. The model identifies the accent from an audio recording. Its output could be used to detect accents and to help English-language learners reduce or improve their accents through training.
☆60 Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection
Users who are interested in speech-accent-detection are comparing it to the libraries listed below.
- Speaker change detection using SincNet and an LSTM/Transformer ☆51 Updated 10 months ago
- Python forced alignment ☆89 Updated last year
- Collection of pretrained models for the Montreal Forced Aligner ☆148 Updated 10 months ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆83 Updated last year
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated last year
- AdaSpeech: Adaptive Text to Speech for Custom Voice ☆157 Updated 3 years ago
- A non-native English corpus for pronunciation scoring task ☆132 Updated 10 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆132 Updated last year
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆160 Updated last year
- Finetuning VITS Efficiently ☆32 Updated last year
- Putting flows on top of neural transducers for better TTS ☆62 Updated last month
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆63 Updated last month
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆153 Updated 2 months ago
- Various speech datasets made available to the public ☆117 Updated 5 months ago
- Phoneme alignment representation compatible with multiple forced aligners ☆21 Updated last year
- SelfRemaster: SSL Speech Restoration ☆88 Updated last year
- An unofficial PyTorch implementation of VALL-E ☆87 Updated last week
- ☆66 Updated 8 months ago
- Train the next generation of TTS systems. ☆165 Updated 8 months ago
- This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm" ☆132 Updated last year
- ☆25 Updated 2 years ago
- Predicts the level of noise and reverberation on your audiofiles ☆149 Updated 11 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆194 Updated 8 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆162 Updated 3 weeks ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆88 Updated 4 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆100 Updated 3 months ago
- Goodness of Pronunciation using Kaldi on Epa-DB database ☆35 Updated last year
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆158 Updated last year
- A sequence-to-sequence voice conversion toolkit. ☆97 Updated 10 months ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆119 Updated 2 years ago