ricky0123/vad

Voice activity detector (VAD) for the browser with a simple API

☆2,014

Alternatives and similar repositories for vad

Users who are interested in vad are comparing it to the libraries listed below.

snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,557 · Jul 3, 2026 · Updated last week
ai-ng / swift
Fast voice assistant powered by Groq, Cartesia, and Vercel.
☆593 · Dec 4, 2025 · Updated 7 months ago
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,258 · Updated this week
botany-labs / voice-ai-js-starter
Starter project for building real-time AI Voice Assistants
☆42 · Sep 24, 2024 · Updated last year
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,170 · Nov 19, 2025 · Updated 7 months ago
DictationDaddy / VAD_WEB_DEMO
In this repository, I show you how to use SILERO VAD with ONNX-WEB runtime to run the VAD completely in the browser.
☆26 · Dec 28, 2024 · Updated last year
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,647 · Nov 12, 2025 · Updated 7 months ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,009 · Jun 26, 2026 · Updated 2 weeks ago
pipecat-ai / smart-turn
☆1,459 · Jan 29, 2026 · Updated 5 months ago
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,119 · Updated this week
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,470 · Dec 12, 2025 · Updated 6 months ago
huggingface / transformers.js
State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
☆16,170 · Jun 24, 2026 · Updated 2 weeks ago
ianmarmour / speech-detector
Local voice activity detection of PCM audio streams using Silero VAD
☆11 · Nov 9, 2023 · Updated 2 years ago
livekit / agents
A framework for building realtime voice AI agents 🤖🎙️📹
☆11,279 · Updated this week
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,255 · Jul 2, 2026 · Updated last week
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,301 · Aug 10, 2024 · Updated last year
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,512 · May 16, 2026 · Updated last month
cartesia-ai / cartesia-js
The JavaScript client for the Cartesia API.
☆133 · Jul 1, 2026 · Updated last week
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆51,528 · Jul 1, 2026 · Updated last week
hanifabd / voice-activity-detection-vad-realtime
Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription
☆115 · Aug 18, 2025 · Updated 10 months ago
chengsokdara / use-whisper
React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in
☆785 · Apr 30, 2024 · Updated 2 years ago
Snirpo / node-vad
Voice activation detection library for NodeJS
☆67 · Oct 18, 2019 · Updated 6 years ago
KoljaB / RealtimeTTS
Converts text to speech in realtime
☆3,979 · May 31, 2026 · Updated last month
KoljaB / RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…
☆9,955 · Jun 12, 2026 · Updated 3 weeks ago
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,623 · Dec 14, 2025 · Updated 6 months ago
pipecat-ai / rtvi-web-demo
Example UI implementing the RTVI web client
☆473 · Dec 3, 2024 · Updated last year
k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime…
☆13,499 · Updated this week
xenova / whisper-web
ML-powered speech recognition directly in your browser
☆3,335 · Oct 1, 2024 · Updated last year
livekit / livekit
End-to-end realtime stack for connecting humans and AI
☆19,705 · Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,705 · Aug 16, 2024 · Updated last year
serenadeai / webrtcvad
webrtcvad provides node.js bindings to the WebRTC voice activity detection library.
☆36 · Dec 6, 2020 · Updated 5 years ago
openai / openai-realtime-console
React app for inspecting, building and debugging with the Realtime API
☆3,603 · Aug 28, 2025 · Updated 10 months ago
myshell-ai / MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
☆7,523 · Dec 24, 2024 · Updated last year
deepgram-devs / nextjs-live-transcription
Get started using Deepgram's Live Transcription with this Next.js demo app
☆271 · Jul 2, 2026 · Updated last week
FunAudioLLM / SenseVoice
Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoreg…
☆8,786 · Jun 29, 2026 · Updated last week
SWivid / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆14,904 · Updated this week
deepgram-devs / deepgram-ai-agent-demo
☆409 · Jan 7, 2026 · Updated 6 months ago
davabase / whisper_real_time
Real time transcription with OpenAI Whisper.
☆2,938 · Apr 15, 2025 · Updated last year
canopyai / Orpheus-TTS
Towards Human-Sounding Speech
☆6,231 · Dec 5, 2025 · Updated 7 months ago