xenova / whisper-web
ML-powered speech recognition directly in your browser
☆2,883 Updated 6 months ago
Alternatives and similar repositories for whisper-web:
Users who are interested in whisper-web are comparing it to the libraries listed below.
- Open Source framework for voice and multimodal conversational AI ☆5,543 Updated this week
- A fast multimodal LLM for real-time voice ☆3,824 Updated 2 months ago
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,672 Updated last month
- A nearly-live implementation of OpenAI's Whisper. ☆2,688 Updated last week
- Local realtime voice AI ☆2,277 Updated last month
- TTS with kokoro and onnx runtime ☆1,886 Updated last week
- A collection of 🤗 Transformers.js demos and example applications ☆1,374 Updated this week
- Voice activity detector (VAD) for the browser with a simple API ☆1,233 Updated 2 months ago
- Whisper with Medusa heads ☆830 Updated last month
- Cross-Platform, GPU Accelerated Whisper 🏎️ ☆1,794 Updated last year
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆2,347 Updated last week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆5,647 Updated 8 months ago
- High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean. ☆5,927 Updated 3 months ago
- 🔍 AI search engine - self-host with local or cloud LLMs ☆3,272 Updated 6 months ago
- State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server! ☆13,433 Updated this week
- Example UI implementing the RTVI web client ☆477 Updated 4 months ago
- Aura is like Siri, but in your browser. An AI voice assistant optimized for low latency responses. ☆1,217 Updated 4 months ago
- Yes, it's another chat over documents implementation... but this one is entirely local! ☆1,746 Updated 3 weeks ago
- Convert any PDF into a podcast episode! ☆2,210 Updated 4 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆2,729 Updated 3 months ago
- ☆1,212 Updated 6 months ago
- first base model for full-duplex conversational audio ☆1,730 Updated 3 months ago
- Fully private LLM chatbot that runs entirely with a browser with no server needed. Supports Mistral and LLama 3. ☆2,606 Updated 10 months ago
- A powerful framework for building realtime voice AI agents 🤖🎙️📹 ☆5,544 Updated this week
- Inference and training library for high-quality TTS models. ☆5,188 Updated 4 months ago
- Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with… ☆3,597 Updated this week
- Document to Markdown OCR library with Llama 3.2 vision ☆2,256 Updated 2 months ago
- ☆1,702 Updated this week
- WhisperPlus: Faster, Smarter, and More Capable 🚀 ☆1,814 Updated last week
- An AI-powered search engine with a generative UI ☆7,301 Updated this week