DictationDaddy / VAD_WEB_DEMO
In this repository, I show you how to use Silero VAD with ONNX Runtime Web to run the VAD completely in the browser.
☆24Updated 10 months ago
Alternatives and similar repositories for VAD_WEB_DEMO
Users that are interested in VAD_WEB_DEMO are comparing it to the libraries listed below
- CLIP as a service - Embed image and sentences, object recognition, visual reasoning, image classification and reverse image search☆65Updated 3 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation☆121Updated last year
- Real-Time Voice Inference Web SDK☆287Updated last week
- A JS web client for connecting to Pipecat bots with voice and vision☆45Updated 10 months ago
- LiveKit real-time and server SDKs for Python☆288Updated last week
- Real-time Voice Activity Detection (VAD) with some example use cases, such as a simple voice bot and live (real-time) transcription☆101Updated 2 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆138Updated last year
- faster-whisper as serverless endpoint☆121Updated 5 months ago
- Thin wrapper around OpenAI Whisper API with streaming support☆89Updated 9 months ago
- Real-time voice agent powered by Agora and OpenAI☆96Updated 2 months ago
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote☆229Updated 8 months ago
- Daily Bots Web Demo showcasing how to build real-time voice AI agents☆246Updated last month
- React / Vanilla JS Text to Speech with highlighting the words and sentences that are being spoken using audio files, text to speech API, …☆175Updated this week
- A lightweight end-of-utterance detection model fine-tuned on SmolLM2-135M, optimized for Raspberry Pi and low-power devices.☆34Updated 6 months ago
- SemanticFinder - frontend-only live semantic search with transformers.js☆302Updated 7 months ago
- Record and stream WAV audio data in the browser across all platforms☆91Updated 11 months ago
- Official open source React components and examples for building with LiveKit.☆337Updated this week
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆72Updated last year
- Pybind11 bindings for Whisper.cpp☆340Updated 10 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆97Updated last year
- Play with OpenAI's new Realtime API in your browser☆338Updated last month
- Open source inference code for Rev's model☆432Updated 6 months ago
- Example UI implementing the RTVI web client☆476Updated 10 months ago
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.☆217Updated last year
- Building blocks for voice-enabled applications in the browser☆37Updated 6 months ago
- ASR + diarization model server with speculative decoding☆63Updated last year
- Starter project for building real-time AI Voice Assistants☆42Updated last year