k-farruh / speech-accent-detection
Humans speak every language with an accent, and a particular accent reflects a person's linguistic background. This model identifies a speaker's accent from an audio recording. Its output can be used to determine which accent a speaker has, help English learners reduce their accent, and track improvement through practice.
☆62 · Updated 3 years ago
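The repository's own code is not excerpted here, so the following is only a minimal sketch of how an accent classifier of this kind might be built: MFCC statistics extracted with librosa and fed to a scikit-learn SVM. The feature set, classifier, file names, and accent labels are illustrative assumptions, not the project's actual pipeline.

```python
# Illustrative sketch only: the real speech-accent-detection pipeline may differ.
# Assumes librosa and scikit-learn; paths, labels, and hyperparameters are hypothetical.
import numpy as np
import librosa
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_features(path: str, sr: int = 16000, n_mfcc: int = 13) -> np.ndarray:
    """Load an audio file and summarize it as the mean and std of its MFCCs."""
    audio, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical training data: recordings and their accent labels.
train_files = ["rec_arabic_01.wav", "rec_mandarin_01.wav", "rec_spanish_01.wav"]
train_labels = ["arabic", "mandarin", "spanish"]

X = np.stack([extract_features(f) for f in train_files])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X, train_labels)

# Score a new recording and print accent probabilities, most likely first.
probs = clf.predict_proba(extract_features("new_speaker.wav").reshape(1, -1))[0]
for accent, p in sorted(zip(clf.classes_, probs), key=lambda t: -t[1]):
    print(f"{accent}: {p:.2f}")
```

A real system would train on many recordings per accent and likely use a neural encoder rather than an SVM, but the overall structure, feature extraction followed by a multi-class classifier, is the same.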
Alternatives and similar repositories for speech-accent-detection
Users interested in speech-accent-detection are comparing it to the libraries listed below.
- PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supp… ☆48 · Updated 2 years ago
- Goodness of Pronunciation using Kaldi on Epa-DB database ☆35 · Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆118 · Updated 2 years ago
- ☆68 · Updated 11 months ago
- A non-native English corpus for pronunciation scoring task ☆147 · Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆53 · Updated 3 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆173 · Updated last week
- Collection of pretrained models for the Montreal Forced Aligner ☆161 · Updated 2 months ago
- ☆68 · Updated 2 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 · Updated 6 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆54 · Updated 2 years ago
- Estimating the Age, Height, and Gender of a speaker with their speech signal. https://arxiv.org/pdf/2110.13653.pdf ☆66 · Updated 4 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆148 · Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆101 · Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆88 · Updated last year
- PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T… ☆194 · Updated 2 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆169 · Updated 2 years ago
- A python library for voice activity detection (VAD) for speech/non-speech segmentation. ☆89 · Updated 2 years ago
- Building a Deep learning model that predicts the gender of a speaker using TensorFlow 2 ☆127 · Updated 2 years ago
- A speaker embedding network in Pytorch that is very quick to set up and use for whatever purposes. ☆91 · Updated 4 months ago
- A curated list of awesome voice activity detection ☆62 · Updated 9 months ago
- Putting flows on top of neural transducers for better TTS ☆63 · Updated 3 weeks ago
- The Emotional Voices Database: Towards Controlling the Emotional Expressiveness in Voice Generation Systems ☆271 · Updated last year
- 😎 Awesome lists about Speech Emotion Recognition ☆96 · Updated 8 months ago
- Toolbox for easy and qualitative one-shot voice conversion ☆46 · Updated 3 years ago
- Neural HMMs are all you need (for high-quality attention-free TTS) ☆159 · Updated 3 weeks ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆150 · Updated last year
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆254 · Updated last year
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g… ☆146 · Updated 3 years ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆31 · Updated 3 months ago