snakers4 / silero-models
Silero Models: pre-trained text-to-speech models made embarrassingly simple
☆5,542 · Updated last week
Alternatives and similar repositories for silero-models
Users that are interested in silero-models are comparing it to the libraries listed below
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆7,238 · Updated last week
- 🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy. ☆2,529 · Updated last year
- A multi-voice TTS system trained with an emphasis on quality ☆14,682 · Updated 11 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,518 · Updated 5 months ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,034 · Updated last year
- A fast, local neural text to speech system ☆10,202 · Updated 2 months ago
- A fast local neural text to speech engine for Mycroft ☆1,233 · Updated 7 months ago
- End to end text to speech system using gruut and onnx ☆831 · Updated 2 years ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,976 · Updated 10 months ago
- A python package to analyze and compare voices with deep learning ☆3,141 · Updated 2 years ago
- Open STT ☆808 · Updated 3 years ago
- Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts) ☆10,045 · Updated last year
- Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker… ☆8,610 · Updated 2 weeks ago
- WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries ☆1,194 · Updated 3 months ago
- Noise suppression using deep filtering ☆3,477 · Updated last year
- Text-prompted Generative Audio Model - With the ability to clone voices ☆3,332 · Updated 2 months ago
- A PyTorch-based Speech Toolkit ☆10,683 · Updated last week
- Python interface to the WebRTC Voice Activity Detector ☆2,389 · Updated last year
- JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU. ☆4,637 · Updated last year
- A comprehensive list of open-source datasets for voice and sound computing (95+ datasets). ☆2,065 · Updated last year
- Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020) We provide a PyTorch implementation of the paper Real Time Speech E… ☆1,835 · Updated 2 years ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence ☆2,647 · Updated 2 months ago
- Phoneme multilingual (Russian-English) voice cloning based on ☆395 · Updated 4 years ago
- Open Text to Speech Server ☆1,110 · Updated last year
- ☆2,566 · Updated this week
- AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of adv… ☆2,112 · Updated 3 months ago
- A python package to build AI-powered real-time audio applications ☆1,497 · Updated 8 months ago
- Transcription, forced alignment, and audio indexing with OpenAI's Whisper ☆2,060 · Updated last week
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ☆1,403 · Updated last year
- A list of accessible speech corpora for ASR, TTS, and other Speech Technologies ☆1,364 · Updated last year