coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
★2,471 · Updated last year
Alternatives and similar repositories for STT
Users who are interested in STT are comparing it to the libraries listed below.
- Examples of how to use or integrate DeepSpeech ★854 · Updated last year
- A python package to analyze and compare voices with deep learning ★3,033 · Updated last year
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies ★1,343 · Updated last year
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ★9,908 · Updated last year
- Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple ★5,387 · Updated last year
- Python interface to the WebRTC Voice Activity Detector ★2,296 · Updated last year
- A comprehensive list of open-source datasets for voice and sound computing (95+ datasets). ★1,967 · Updated last year
- TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in Tensorflow 2. Supported languages that can use characters or subw… ★984 · Updated last month
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ★6,289 · Updated last month
- WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries ★1,125 · Updated last month
- Open Text to Speech Server ★1,079 · Updated last year
- A python package to build AI-powered real-time audio applications ★1,363 · Updated 5 months ago
- TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, Germa… ★3,949 · Updated last year
- A fast local neural text to speech engine for Mycroft ★1,204 · Updated 3 months ago
- Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice … ★511 · Updated 2 years ago
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker… ★7,881 · Updated last week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ★5,840 · Updated 11 months ago
- A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources. ★1,769 · Updated 9 months ago
- YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone ★995 · Updated 8 months ago
- Thorsten-Voice: A free to use, offline working, high quality german TTS voice should be available for every project without any license s… ★627 · Updated 6 months ago
- An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning. ★838 · Updated last year
- On-device streaming speech-to-text engine powered by deep learning ★633 · Updated 2 weeks ago
- VOSK Speech Recognition Toolkit ★456 · Updated 3 years ago
- An opensource text-to-speech (TTS) voice building tool ★677 · Updated 11 months ago
- Noise suppression using deep filtering ★3,178 · Updated 9 months ago
- 🐸 collection of TTS papers ★705 · Updated last year
- Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech E… ★1,799 · Updated 2 years ago
- An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech" ★2,058 · Updated last year
- End-to-End Speech Processing Toolkit ★9,293 · Updated last week
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ★1,379 · Updated last year