snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

☆9,748

Alternatives and similar repositories for silero-vad

Users who are interested in silero-vad are comparing it to the libraries listed below.

pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,338 · Updated this week
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,567 · Nov 19, 2025 · Updated 8 months ago
TEN-framework / ten-vad
Voice Activity Detector (VAD): low-latency, high-performance and lightweight
☆2,210 · Feb 2, 2026 · Updated 5 months ago
ricky0123 / vad
Voice activity detector (VAD) for the browser with a simple API
☆2,024 · Jan 30, 2026 · Updated 5 months ago
modelscope / FunASR
Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenA…
☆19,503 · Updated this week
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,291 · Jul 13, 2026 · Updated 2 weeks ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,715 · Jun 15, 2026 · Updated last month
wiseman / py-webrtcvad
Python interface to the WebRTC Voice Activity Detector
☆2,495 · Jul 4, 2024 · Updated 2 years ago
k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime…
☆13,819 · Updated this week
QwenAudio / SenseVoice
Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio…
☆8,940 · Updated this week
snakers4 / silero-models
Silero Models: pre-trained text-to-speech models made embarrassingly simple
☆6,028 · Jun 4, 2026 · Updated last month
modelscope / 3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
☆3,072 · Dec 8, 2025 · Updated 7 months ago
espnet / espnet
End-to-End Speech Processing Toolkit
☆9,904 · Updated this week
wenet-e2e / wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
☆1,369 · Jul 8, 2026 · Updated 2 weeks ago
Rikorose / DeepFilterNet
Noise suppression using deep filtering
☆4,514 · Oct 17, 2024 · Updated last year
pipecat-ai / smart-turn
☆1,487 · Jan 29, 2026 · Updated 5 months ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,349 · Jul 11, 2026 · Updated 2 weeks ago
wenet-e2e / wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
☆5,210 · Jun 15, 2026 · Updated last month
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,656 · Nov 12, 2025 · Updated 8 months ago
QwenAudio / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆22,441 · May 25, 2026 · Updated 2 months ago
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,641 · Apr 15, 2026 · Updated 3 months ago
modelscope / ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…
☆4,340 · Aug 14, 2025 · Updated 11 months ago
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,830 · Aug 16, 2024 · Updated last year
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,747 · Updated this week
NVIDIA-NeMo / Speech
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆17,824 · Updated this week
k2-fsa / icefall
☆1,465 · Jul 16, 2026 · Updated last week
fishaudio / fish-speech
SOTA Open Source TTS
☆31,388 · Updated this week
lhotse-speech / lhotse
Tools for handling multimodal data in machine learning projects.
☆1,143 · Jun 22, 2026 · Updated last month
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,740 · May 16, 2026 · Updated 2 months ago
k2-fsa / k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
☆1,348 · Jul 11, 2026 · Updated 2 weeks ago
juanmc2005 / diart
A Python package to build AI-powered real-time audio applications
☆2,005 · Jun 19, 2026 · Updated last month
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,320 · Aug 10, 2024 · Updated last year
FireRedTeam / FireRedASR
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR be…
☆1,942 · Feb 25, 2026 · Updated 5 months ago
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,592 · Jul 3, 2026 · Updated 3 weeks ago
facebookresearch / omnilingual-asr
Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
☆2,861 · Dec 30, 2025 · Updated 6 months ago
facebookresearch / denoiser
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech E…
☆1,904 · Mar 14, 2023 · Updated 3 years ago
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,099 · Jan 8, 2025 · Updated last year
xiph / rnnoise
Recurrent neural network for audio noise reduction
☆5,747 · Feb 22, 2025 · Updated last year
huggingface / speech-to-speech
Build local voice agents with open-source models
☆6,565 · Updated this week