KoljaB / RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
☆1,641 Updated last week
Related projects:
- Converts text to speech in realtime ☆1,730 Updated 3 weeks ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆1,770 Updated 2 weeks ago
- Command Your World with Voice ☆368 Updated 3 weeks ago
- A nearly-live implementation of OpenAI's Whisper. ☆1,798 Updated 2 weeks ago
- Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with C… ☆475 Updated last month
- Real time transcription with OpenAI Whisper. ☆2,260 Updated 3 months ago
- MARS5 speech model (TTS) from CAMB.AI ☆2,440 Updated last month
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper ☆3,315 Updated 2 weeks ago
- 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production ☆341 Updated this week
- Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. ☆1,374 Updated this week
- ☆1,079 Updated 2 months ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,509 Updated last month
- Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS ☆650 Updated 2 months ago
- Multilingual Automatic Speech Recognition with word-level timestamps and confidence ☆1,865 Updated last month
- Build real-time multimodal AI applications 🤖🎙️📹 ☆1,053 Updated this week
- AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of adv… ☆871 Updated this week
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆4,714 Updated last month
- Foundational model for human-like, expressive TTS ☆3,721 Updated last month
- An Open Source text-to-speech system built by inverting Whisper. ☆3,772 Updated 3 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT4-o ☆2,989 Updated last week
- Inference and training library for high-quality TTS models. ☆4,193 Updated last month
- A Python package to build AI-powered real-time audio applications ☆992 Updated 2 months ago
- Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate. ☆3,495 Updated 2 months ago
- Whisper with Medusa heads ☆774 Updated last week
- Silero VAD: pre-trained enterprise-grade Voice Activity Detector ☆3,969 Updated last week
- ☆384 Updated this week
- A fast multimodal LLM for real-time voice ☆847 Updated this week
- ☆486 Updated 4 months ago
- Open Source framework for voice and multimodal conversational AI ☆3,044 Updated this week
- A fast, local neural text to speech system ☆5,776 Updated last month