coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
★2,373 · Updated last year
Alternatives and similar repositories for STT:
Users who are interested in STT are comparing it to the libraries listed below.
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector · ★5,254 · Updated last month
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) · ★9,725 · Updated last year
- Examples of how to use or integrate DeepSpeech · ★842 · Updated last year
- TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supported languages that can use characters or subw… · ★959 · Updated this week
- Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice … · ★505 · Updated last year
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies · ★1,316 · Updated 9 months ago
- A nearly-live implementation of OpenAI's Whisper. · ★2,586 · Updated 2 weeks ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone · ★952 · Updated 4 months ago
- Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node · ★8,992 · Updated last week
- WaveRNN Vocoder + TTS · ★2,153 · Updated 2 years ago
- eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents. · ★4,811 · Updated 2 weeks ago
- End-to-End Speech Processing Toolkit · ★8,872 · Updated this week
- A python package to build AI-powered real-time audio applications · ★1,211 · Updated last month
- VOSK Speech Recognition Toolkit · ★404 · Updated 2 years ago
- Open Text to Speech Server · ★1,016 · Updated 11 months ago
- WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries · ★1,003 · Updated 6 months ago
- End to end text to speech system using gruut and onnx · ★824 · Updated last year
- WaveNet vocoder · ★2,351 · Updated last year
- Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple · ★5,173 · Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models · ★5,546 · Updated 7 months ago
- 🐸 collection of TTS papers · ★678 · Updated 8 months ago
- A fast local neural text to speech engine for Mycroft · ★1,154 · Updated last year
- An Open Source text-to-speech system built by inverting Whisper. · ★4,156 · Updated 3 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation · ★2,578 · Updated 2 months ago
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, Germa… · ★3,906 · Updated 8 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production · ★38,513 · Updated 7 months ago
- DeepMind's Tacotron-2 Tensorflow implementation · ★2,305 · Updated last year
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence · ★2,302 · Updated last month
- A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial) · ★2,972 · Updated last year
- A PyTorch-based Speech Toolkit · ★9,521 · Updated this week