k-farruh / speech-accent-detection
Everyone speaks a language with an accent, and a particular accent reflects the speaker's linguistic background. The model identifies a speaker's accent from an audio recording. Its output could be used to detect accents and to help students learning English reduce or improve their accents through training.
☆62 Updated 3 years ago
Alternatives and similar repositories for speech-accent-detection
Users who are interested in speech-accent-detection are comparing it to the libraries listed below.
- Speaker change detection using SincNet and an LSTM/Transformer ☆53 Updated 3 months ago
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 Updated last week
- Goodness of Pronunciation using Kaldi on Epa-DB database ☆35 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆86 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆118 Updated 2 years ago
- Collection of pretrained models for the Montreal Forced Aligner ☆165 Updated 3 months ago
- SelfRemaster: SSL Speech Restoration ☆90 Updated last year
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆91 Updated 5 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆148 Updated last year
- A curated list of awesome voice activity detection ☆62 Updated 10 months ago
- Predicts the level of noise and reverberation on your audiofiles ☆163 Updated 3 months ago
- Toolbox for easy and qualitative one-shot voice conversion ☆46 Updated 3 years ago
- A non-native English corpus for pronunciation scoring task ☆151 Updated last year
- ☆68 Updated last year
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021 ☆69 Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆53 Updated 2 years ago
- SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code ☆202 Updated 3 years ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆95 Updated 8 months ago
- ☆68 Updated 3 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆173 Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆107 Updated 2 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆65 Updated 2 years ago
- Phoneme segmentation using pre-trained speech models ☆55 Updated 2 years ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆255 Updated last year
- Python forced alignment ☆94 Updated last year
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g… ☆146 Updated 3 years ago
- Clustering-based methods for overlapping diarization ☆80 Updated last year
- Various speech datasets made available to the public ☆131 Updated 9 months ago
- A set of audio augmentation techniques to perform noise insertion in datasets used for Automatic Speech Recognition. ☆45 Updated 3 years ago