coqui-ai / STT-models
Open models for Coqui STT
☆136 · Updated last year
Alternatives and similar repositories for STT-models:
Users who are interested in STT-models compare it to the repositories listed below.
- 🐸STT integration examples ☆127 · Updated 2 years ago
- On-device voice activity detection (VAD) powered by deep learning ☆206 · Updated this week
- C++ library for converting text to phonemes for Piper ☆114 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆309 · Updated 5 months ago
- Official Implementation of StyleTTS ☆429 · Updated 3 months ago
- Self-hosted, high-quality voice recognition for de-googled Android using Whisper. Like Siri or OK Google. ☆62 · Updated last year
- SEPIA server to support open-source speech recognition via WebSocket connection. ☆125 · Updated 5 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 · Updated 11 months ago
- Experiments to test different speech recognition systems for SEPIA Framework ☆60 · Updated last year
- Metadata and versioning details for the Common Voice dataset ☆146 · Updated 3 weeks ago
- 🐸 - A general purpose model trainer, as flexible as it gets ☆213 · Updated last year
- OpenVINO version of openai/whisper ☆166 · Updated last year
- NeMo text processing for ASR and TTS ☆323 · Updated last week
- ☆353 · Updated last year
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆348 · Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆135 · Updated last year
- Desktop application for neural speech synthesis written in C++ ☆214 · Updated 2 years ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆76 · Updated 10 months ago
- Voice models for Mimic 3 text to speech system ☆144 · Updated 9 months ago
- ONNX-compatible Fast SeamlessM4T: Massively Multilingual & Multimodal Machine Translation ☆43 · Updated last year
- Model for recasing and repunctuating ASR transcripts ☆133 · Updated last year
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆111 · Updated last year
- On-device streaming text-to-speech engine powered by deep learning ☆76 · Updated this week
- 🐸TTS recipes for different datasets ☆86 · Updated 2 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆147 · Updated 11 months ago
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ☆243 · Updated 10 months ago
- 🐸 collection of TTS papers ☆679 · Updated 9 months ago
- ☆354 · Updated 7 months ago
- ☆187 · Updated 3 years ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆369 · Updated last year