snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

☆6,022

Alternatives and similar repositories for silero-models

Users who are interested in silero-models are comparing it to the libraries listed below.

snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,661 · Jul 16, 2026 · Updated last week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,801 · Aug 16, 2024 · Updated last year
snakers4 / open_stt
Open STT
☆826 · Mar 11, 2022 · Updated 4 years ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,711 · Jun 15, 2026 · Updated last month
neonbjb / tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
☆14,865 · Nov 19, 2024 · Updated last year
espnet / espnet
End-to-End Speech Processing Toolkit
☆9,902 · Updated this week
snakers4 / russian_stt_text_normalization
Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks
☆122 · Mar 15, 2021 · Updated 5 years ago
NVIDIA-NeMo / Speech
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆17,814 · Updated this week
alphacep / vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
☆14,980 · Jul 2, 2026 · Updated 3 weeks ago
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,327 · Updated this week
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,316 · Aug 10, 2024 · Updated last year
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,624 · Dec 14, 2025 · Updated 7 months ago
suno-ai / bark
🔊 Text-Prompted Generative Audio Model
☆39,214 · Aug 19, 2024 · Updated last year
mozilla / TTS
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
☆10,164 · Nov 9, 2023 · Updated 2 years ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,229 · Jul 13, 2026 · Updated last week
rhasspy / piper
A fast, local neural text to speech system
☆11,257 · Aug 26, 2025 · Updated 10 months ago
ai-forever / ru-gpts
Russian GPT3 models.
☆2,089 · Dec 12, 2022 · Updated 3 years ago
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,506 · Apr 15, 2026 · Updated 3 months ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,259 · Jul 11, 2026 · Updated 2 weeks ago
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,581 · Dec 10, 2024 · Updated last year
facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,251 · Sep 30, 2025 · Updated 9 months ago
yandex / YaLM-100B
Pretrained language model with 100B parameters
☆3,756 · Jul 10, 2023 · Updated 3 years ago
coqui-ai / STT
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
☆2,595 · Mar 11, 2024 · Updated 2 years ago
janvarev / Irene-Voice-Assistant
Irina: a Russian voice assistant that works offline. Supports skills via plugins.
☆1,148 · Updated this week
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,503 · Nov 19, 2025 · Updated 8 months ago
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,097 · Jan 8, 2025 · Updated last year
NATSpeech / NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and Diff…
☆1,004 · Apr 2, 2023 · Updated 3 years ago
facebookresearch / denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech E…
☆1,904 · Mar 14, 2023 · Updated 3 years ago
DigitalPhonetics / IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
☆2,207 · Jan 25, 2026 · Updated 6 months ago
alphacep / vosk-tts
Text To Speech Synthesis with Vosk
☆268 · Jun 6, 2026 · Updated last month
CorentinJ / Real-Time-Voice-Cloning
Clone a voice in 5 seconds to generate arbitrary speech in real-time
☆60,056 · Mar 9, 2026 · Updated 4 months ago
coqui-ai / open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
☆1,397 · Jun 6, 2024 · Updated 2 years ago
mozilla / DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Ras…
☆26,775 · Jun 19, 2025 · Updated last year
RHVoice / RHVoice
a free and open source speech synthesizer for Russian and other languages
☆1,824 · Updated this week
lhotse-speech / lhotse
Tools for handling multimodal data in machine learning projects.
☆1,143 · Jun 22, 2026 · Updated last month
facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…
☆23,519 · Mar 3, 2026 · Updated 4 months ago
rhasspy / gruut
A tokenizer, text cleaner, and phonemizer for many human languages.
☆330 · Nov 15, 2024 · Updated last year
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆37,012 · Apr 19, 2025 · Updated last year
facebookresearch / encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
☆4,002 · Jan 4, 2024 · Updated 2 years ago