DictationDaddy / VAD_WEB_DEMO
In this repository, I show you how to use SILERO VAD with the ONNX-WEB runtime to run the VAD completely in the browser.
☆21Updated 3 months ago
Alternatives and similar repositories for VAD_WEB_DEMO:
Users who are interested in VAD_WEB_DEMO are comparing it to the libraries listed below:
- Whisper realtime streaming for long speech-to-text transcription and translation☆113Updated last year
- A JS web client for connecting to Pipecat bots with voice and vision☆44Updated 4 months ago
- Real-Time Voice Inference Web SDK☆219Updated last week
- Daily Bots Web Demo showcasing how to build real-time voice AI agents☆233Updated 5 months ago
- An open-source chatbot architecture for voice/vision (and multimodal) assistants that run locally (CPU/GPU-bound) or remotely (I/O-bound).☆40Updated this week
- ☆224Updated this week
- Record and stream WAV audio data in the browser across all platforms☆80Updated 5 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆130Updated 10 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o☆53Updated 6 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆94Updated 11 months ago
- GGML implementation of BERT model with Python bindings and quantization.☆56Updated last year
- Play with OpenAI's new Realtime API in your browser☆320Updated 4 months ago
- CLIP as a service - Embed images and sentences, object recognition, visual reasoning, image classification and reverse image search☆60Updated last year
- A simple voice assistant example built with Next.js and LiveKit React Components☆140Updated this week
- Faster Whisper ASR transcription with CTranslate2☆20Updated 5 months ago
- faster-whisper as serverless endpoint☆95Updated this week
- TTS support with GGML☆28Updated this week
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆34Updated 4 months ago
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote☆200Updated 2 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate☆104Updated 2 weeks ago
- ☆11Updated last month
- Have a natural voice conversation with an LLM☆247Updated 4 months ago
- ASR + diarization model server with speculative decoding☆60Updated 11 months ago
- Cog wrapper for Coqui / xtts-v2☆74Updated 4 months ago
- SemanticFinder - frontend-only live semantic search with transformers.js☆268Updated 3 weeks ago
- G2P☆210Updated last week
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper☆25Updated 8 months ago
- TypeScript-based library for real-time audio transcription, integrating OpenAI's Whisper model for accurate speech-to-text conversion.☆68Updated last year
- Create Animated Subtitles From .SRT files in Remotion☆49Updated last year
- Speech Diarization for scrum automation☆102Updated last year