speechsuper / SpeechSuper-API-Samples
Deep learning based speech and pronunciation assessment API for 8 languages.
☆42 Updated 11 months ago
Alternatives and similar repositories for SpeechSuper-API-Samples
Users that are interested in SpeechSuper-API-Samples are comparing it to the libraries listed below
- A non-native English corpus for pronunciation scoring task ☆132 Updated 10 months ago
- Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment". ☆172 Updated 2 years ago
- ☆37 Updated last year
- ☆15 Updated last month
- Fine-Tune Whisper with Transformers and PEFT ☆55 Updated last year
- Goodness of Pronunciation (GOP) for oral reading assessment. ☆51 Updated 3 years ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen… ☆226 Updated 3 years ago
- Repository for the paper: VoiceMe: Personalized voice generation in TTS ☆126 Updated 3 years ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆94 Updated last year
- Spoken Language assessment ☆43 Updated 4 years ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆15 Updated last year
- Charsiu: A neural phonetic aligner. ☆299 Updated 2 years ago
- Application of MB-iSTFT-VITS components to vits2_pytorch ☆127 Updated 5 months ago
- Barkify: an unofficial training implementation of Bark TTS by suno-ai ☆129 Updated last year
- Faster Tortoise inference than Tortoise Fast Fork ☆128 Updated last year
- This repository is the implementation of the paper, "Score-balanced Loss for Multi-aspect Pronunciation Assessment" (Interspeech 2023). ☆19 Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆149 Updated last year
- Multilingual G2P in 100 languages ☆324 Updated last year
- Timething is a library for aligning text transcripts with their audio recordings. ☆119 Updated 5 months ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆63 Updated last month
- ☆92 Updated 2 years ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆67 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆114 Updated 2 years ago
- PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean,… ☆301 Updated 3 years ago
- AdaSpeech: Adaptive Text to Speech for Custom Voice ☆157 Updated 3 years ago
- ☆25 Updated 2 years ago
- ONNX Inference of Pyannote Segmentation ☆87 Updated 4 months ago
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction ☆252 Updated 11 months ago
- ☆67 Updated 5 months ago
- Train the next generation of TTS systems. ☆165 Updated 8 months ago