coqui-ai / STT-models
Open models for Coqui STT
★150 · Updated 2 years ago
Alternatives and similar repositories for STT-models
Users who are interested in STT-models are comparing it to the libraries listed below.
- 🐸STT integration examples ★130 · Updated 3 years ago
- openvino version of openai/whisper ★180 · Updated 2 years ago
- 🐸 - A general purpose model trainer, as flexible as it gets ★233 · Updated last year
- Live speech recognition using Facebook's wav2vec 2.0 model. ★376 · Updated 2 years ago
- Voice models for Mimic 3 text to speech system ★161 · Updated last year
- SEPIA server to support open-source speech recognition via WebSocket connection. ★135 · Updated last year
- On-device voice activity detection (VAD) powered by deep learning ★243 · Updated 2 weeks ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ★332 · Updated last year
- C++ library for converting text to phonemes for Piper ★138 · Updated 6 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback. ★121 · Updated 2 years ago
- ★359 · Updated last year
- A curated list of awesome voice activity detection ★71 · Updated last year
- Official Implementation of StyleTTS ★460 · Updated last year
- Desktop application for neural speech synthesis written in C++ ★212 · Updated last week
- Experiments to test different speech recognition systems for SEPIA Framework ★63 · Updated 2 years ago
- ONNX Inference of Pyannote Segmentation ★97 · Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★154 · Updated last year
- ★258 · Updated last year
- How to create your own model for vosk ★75 · Updated 4 years ago
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. ★215 · Updated last year
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ★260 · Updated 2 months ago
- NeMo text processing for ASR and TTS ★418 · Updated last week
- On-device noise suppression powered by deep learning ★82 · Updated 2 weeks ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ★348 · Updated last year
- Grapheme to phoneme conversion with deep learning. ★420 · Updated 2 years ago
- ★204 · Updated 3 years ago
- Finetune VITS and MMS using HuggingFace's tools ★189 · Updated last year
- Tunable pipelines ★41 · Updated 4 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ★100 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★137 · Updated 2 years ago