coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
☆2,546 · Updated last year
Alternatives and similar repositories for STT
Users who are interested in STT are comparing it to the libraries listed below.
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ☆10,089 · Updated 2 years ago
- A fast local neural text to speech engine for Mycroft ☆1,242 · Updated 9 months ago
- Examples of how to use or integrate DeepSpeech ☆856 · Updated 2 years ago
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies ☆1,376 · Updated last year
- Open Text to Speech Server ☆1,117 · Updated last year
- A Python package to analyze and compare voices with deep learning ☆3,196 · Updated 2 years ago
- Thorsten-Voice: A free to use, offline working, high quality German TTS voice should be available for every project without any license s… ☆678 · Updated last week
- On-device streaming speech-to-text engine powered by deep learning ☆647 · Updated this week
- Silero Models: pre-trained text-to-speech models made embarrassingly simple ☆5,687 · Updated 3 weeks ago
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ☆1,419 · Updated last year
- Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice … ☆509 · Updated 2 years ago
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆7,761 · Updated this week
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ☆1,045 · Updated last year
- An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning. ☆842 · Updated 2 years ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆3,494 · Updated last month
- WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries ☆1,217 · Updated 5 months ago
- TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supported languages that can use characters or subw… ☆1,000 · Updated 6 months ago
- A Python package to build AI-powered real-time audio applications ☆1,904 · Updated 10 months ago
- 🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech. ☆1,159 · Updated last year
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence ☆2,712 · Updated 3 months ago
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ☆2,133 · Updated 2 years ago
- A lightweight, simple-to-use, RNN wake word listener ☆951 · Updated 2 years ago
- A nearly-live implementation of OpenAI's Whisper. ☆3,692 · Updated 3 months ago
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supported including English, French, Korean, Chinese, Germa… ☆3,990 · Updated last year
- Python interface to the WebRTC Voice Activity Detector ☆2,416 · Updated last year
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,104 · Updated last year
- On-device Speech-to-Intent engine powered by deep learning ☆695 · Updated this week
- 🐸 collection of TTS papers ☆720 · Updated last year
- WaveRNN Vocoder + TTS ☆2,177 · Updated 3 years ago
- eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents. ☆5,978 · Updated 2 weeks ago