coqui-ai / STT-examples
🐸STT integration examples
★127 · Updated 2 years ago
Alternatives and similar repositories for STT-examples:
Users who are interested in STT-examples are comparing it to the libraries listed below:
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. ★205 · Updated 9 months ago
- On-device voice activity detection (VAD) powered by deep learning ★206 · Updated last week
- A tokenizer, text cleaner, and phonemizer for many human languages. ★310 · Updated 5 months ago
- Open models for Coqui STT ★137 · Updated last year
- 🐸TTS recipes for different datasets ★87 · Updated 2 years ago
- Model for recasing and repunctuating ASR transcripts ★133 · Updated last year
- DeepSpeech-based forced alignment tool ★237 · Updated 4 years ago
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code ★148 · Updated 11 months ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus ★170 · Updated 4 months ago
- [WIP] VoiceSmith makes training text to speech models easy. ★224 · Updated 2 years ago
- Various speech datasets made available to the public ★116 · Updated 4 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ★100 · Updated 2 months ago
- Grapheme to phoneme conversion with deep learning. ★381 · Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ★112 · Updated 2 years ago
- Linguistic processing for Common Voice ★55 · Updated last year
- VCTK multi-speaker tacotron for ICASSP 2020 ★266 · Updated 3 years ago
- How to create your own model for Vosk ★72 · Updated 3 years ago
- Python server for communicating with Kaldi from the browser using WebRTC ★69 · Updated last year
- Desktop application for neural speech synthesis written in C++ ★215 · Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ★352 · Updated last year
- ★39 · Updated last year
- Provides training, inference and voice conversion recipes for RADTTS and RADTTS++: Flow-based TTS models with Robust Alignment Learning, … ★286 · Updated 2 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ★137 · Updated 4 months ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ★248 · Updated 9 months ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ★102 · Updated 2 years ago
- PyTorch implementation of DeepMind's WaveRNN model ★121 · Updated 5 years ago
- PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling ★191 · Updated 3 years ago
- This repository is a collection of TTS Models in TFLite ★192 · Updated 4 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ★335 · Updated 11 months ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech ★230 · Updated 2 years ago