coqui-ai / STT-examples
🐸STT integration examples
★121 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for STT-examples
- A tokenizer, text cleaner, and phonemizer for many human languages. ★285 · Updated this week
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code. ★141 · Updated 6 months ago
- Open models for Coqui STT. ★122 · Updated last year
- On-device voice activity detection (VAD) powered by deep learning. ★179 · Updated this week
- Live speech recognition using Facebook's wav2vec 2.0 model. ★328 · Updated 9 months ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus. ★165 · Updated 4 months ago
- Segment an audio file and obtain utterance alignments. (Python package) ★321 · Updated 6 months ago
- ESPnet Model Zoo. ★245 · Updated last year
- ★251 · Updated last year
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. ★200 · Updated 3 months ago
- ★34 · Updated 9 months ago
- VCTK multi-speaker tacotron for ICASSP 2020. ★265 · Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper. ★99 · Updated last year
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation. ★240 · Updated last year
- This repository is a collection of TTS Models in TFLite. ★189 · Updated 3 years ago
- PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech. ★224 · Updated 2 years ago
- Model for recasing and repunctuating ASR transcripts. ★129 · Updated 7 months ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ★323 · Updated 2 years ago
- PyTorch implementation of DeepMind's WaveRNN model. ★121 · Updated 5 years ago
- Open tools and data for cloudless automatic speech recognition. ★443 · Updated 3 years ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ★371 · Updated 3 weeks ago
- DeepSpeech based forced alignment tool. ★234 · Updated 3 years ago
- Grapheme to phoneme conversion with deep learning. ★358 · Updated 11 months ago
- OpenVINO version of openai/whisper. ★161 · Updated last year
- PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. ★189 · Updated 3 years ago
- Advanced data structures for handling temporal segments with attached labels. ★99 · Updated 5 months ago
- Desktop application for neural speech synthesis written in C++. ★210 · Updated last year
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.