KoljaB/RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

☆9,983

Alternatives and similar repositories for RealtimeSTT

Users who are interested in RealtimeSTT are comparing it to the libraries listed below.

KoljaB / RealtimeTTS
Converts text to speech in realtime
☆3,986 · May 31, 2026 · Updated last month
KoljaB / Linguflex
Command Your World with Voice
☆813 · Jun 17, 2025 · Updated last year
fixie-ai / ultravox
A fast multimodal LLM for real-time voice
☆4,475 · Dec 12, 2025 · Updated 7 months ago
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,276 · Nov 19, 2025 · Updated 7 months ago
OpenBMB / MiniCPM-V
A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone
☆25,880 · Jun 25, 2026 · Updated 2 weeks ago
fishaudio / fish-speech
SOTA Open Source TTS
☆31,259 · Jun 9, 2026 · Updated last month
unclecode / crawl4ai
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
☆72,613 · Updated this week
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆36,936 · Apr 19, 2025 · Updated last year
stanford-oval / storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
☆30,074 · Sep 30, 2025 · Updated 9 months ago
livekit / agents
A framework for building realtime voice AI agents 🤖🎙️📹
☆11,360 · Updated this week
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,648 · Nov 12, 2025 · Updated 8 months ago
agno-agi / agno
Build, run, and manage your own agent platform.
☆41,166 · Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,750 · Aug 16, 2024 · Updated last year
Canner / WrenAI
GenBI (Generative BI) for AI agents, an open-source, governed text-to-SQL through an open context layer that turns natural-language quest…
☆15,808 · Updated this week
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,130 · Updated this week
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,066 · Updated this week
khoj-ai / khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. …
☆35,656 · Jun 24, 2026 · Updated 2 weeks ago
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,093 · Updated this week
gradio-app / fastrtc
The python library for real-time communication
☆4,618 · Jan 12, 2026 · Updated 6 months ago
TEN-framework / ten-framework
Open-source framework for conversational voice AI agents
☆10,900 · Updated this week
pipecat-ai / pipecat
Open Source framework for voice and multimodal conversational AI
☆13,418 · Updated this week
screenpipe / screenpipe
YC (S26) | AI that knows what you've seen, said, or heard. Records everything you do, say, hear 24/7, local, private, secure. Connect to …
☆19,845 · Updated this week
mem0ai / mem0
Universal memory layer for AI Agents
☆60,815 · Updated this week
TabbyML / tabby
Self-hosted AI coding assistant
☆33,696 · Jun 30, 2026 · Updated 2 weeks ago
exo-explore / exo
Run frontier AI locally.
☆46,258 · Jun 23, 2026 · Updated 3 weeks ago
unslothai / unsloth
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.
☆68,204 · Updated this week
Cinnamon / kotaemon
An open-source RAG-based tool for chatting with your documents.
☆25,545 · Jun 9, 2026 · Updated last month
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,584 · Jul 3, 2026 · Updated last week
k2-fsa / sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime…
☆13,558 · Updated this week
KoljaB / RealtimeVoiceChat
Have a natural, spoken conversation with AI!
☆3,797 · Jul 11, 2025 · Updated last year
OpenHands / OpenHands
🙌 OpenHands: AI-Driven Development
☆80,766 · Updated this week
ItzCrazyKns / Vane
Vane is an AI-powered answering engine.
☆35,666 · Apr 11, 2026 · Updated 3 months ago
open-mmlab / Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio…
☆9,932 · Mar 25, 2026 · Updated 3 months ago
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,583 · Dec 10, 2024 · Updated last year
myshell-ai / MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
☆7,536 · Dec 24, 2024 · Updated last year
Mintplex-Labs / anything-llm
Stop renting your intelligence. Own it with AnythingLLM. Everything you need for a powerful local-first agent experience
☆63,282 · Updated this week
browser-use / browser-use
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
☆104,742 · Updated this week
BerriAI / litellm
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing a…
☆53,568 · Updated this week
Skyvern-AI / skyvern
Automate browser based workflows with AI
☆22,222 · Updated this week