ricky0123 / vad
Voice activity detector (VAD) for the browser with a simple API
☆773 Updated last month
Related projects:
- Whisper realtime streaming for long speech-to-text transcription and translation ☆1,770 Updated 2 weeks ago
- Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS ☆650 Updated 2 months ago
- A nearly-live implementation of OpenAI's Whisper. ☆1,798 Updated 2 weeks ago
- A python package to build AI-powered real-time audio applications ☆992 Updated 2 months ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence ☆1,865 Updated last month
- React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in ☆707 Updated 4 months ago
- Converts text to speech in realtime ☆1,730 Updated 3 weeks ago
- Build real-time multimodal AI applications 🤖 🎙️📹 ☆1,053 Updated this week
- A fast multimodal LLM for real-time voice ☆847 Updated this week
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆3,969 Updated last week
- Example UI implementing the RTVI web client ☆468 Updated last month
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,509 Updated last month
- Whisper with Medusa heads ☆774 Updated last week
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ☆262 Updated 7 months ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆341 Updated this week
- Streaming transcriber with whisper ☆685 Updated last year
- Transcription, forced alignment, and audio indexing with OpenAI's Whisper ☆1,483 Updated last week
- Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/ ☆774 Updated 4 months ago
- ML-powered speech recognition directly in your browser ☆1,473 Updated 3 months ago
- Real time transcription with OpenAI Whisper. ☆2,260 Updated 3 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆421 Updated 10 months ago
- Node.js bindings for OpenAI's Whisper. (C++ CPU version by ggerganov) ☆223 Updated last month
- Local SRT/LLM/TTS Voicechat ☆471 Updated last month
- OpenAI Whisper ASR Webservice API ☆1,975 Updated last month
- Suno AI's Bark model in C/C++ for fast text-to-speech ☆684 Updated 2 months ago
- Real-time transcription using faster-whisper ☆367 Updated last month
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆188 Updated last month