ufal/whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

☆3,660

Alternatives and similar repositories for whisper_streaming

Users that are interested in whisper_streaming are comparing it to the libraries listed below.

collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,202 · Updated this week
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,704 · Nov 19, 2025 · Updated 8 months ago
davabase / whisper_real_time
Real time transcription with OpenAI Whisper.
☆2,940 · Apr 15, 2025 · Updated last year
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,393 · Jul 13, 2026 · Updated 2 weeks ago
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,840 · Jul 16, 2026 · Updated 2 weeks ago
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆2,009 · Jun 19, 2026 · Updated last month
speaches-ai / speaches
☆3,558 · Updated this week
alesaccoia / VoiceStreamAI
Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS
☆959 · Oct 2, 2024 · Updated last year
luweigen / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆121 · Jan 29, 2024 · Updated 2 years ago
ufal / SimulStreaming
☆648 · Jul 12, 2026 · Updated 3 weeks ago
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,371 · Jul 24, 2026 · Updated last week
linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
☆2,835 · Sep 9, 2025 · Updated 10 months ago
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,102 · Jan 8, 2025 · Updated last year
backspacetg / simul_whisper
Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
☆113 · Mar 30, 2025 · Updated last year
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,606 · Jul 3, 2026 · Updated 3 weeks ago
KoljaB / RealtimeSTT
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcri…
☆10,024 · Jun 12, 2026 · Updated last month
KoljaB / RealtimeTTS
Converts text to speech in realtime
☆4,003 · Updated this week
Vaibhavs10 / insanely-fast-whisper
☆13,026 · Oct 25, 2025 · Updated 9 months ago
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,617 · Feb 23, 2026 · Updated 5 months ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,518 · Updated this week
QuentinFuxa / WhisperLiveKit
Simultaneous speech-to-text models
☆10,572 · Updated this week
nyrahealth / CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
☆1,110 · Updated this week
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,785 · May 16, 2026 · Updated 2 months ago
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆106,447 · Updated this week
reriiasu / speech-to-text
Real-time transcription using faster-whisper
☆615 · Jul 23, 2024 · Updated 2 years ago
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,630 · Dec 14, 2025 · Updated 7 months ago
modelscope / FunASR
Open-source speech recognition toolkit for training, inference, streaming ASR, VAD, punctuation, speaker diarization pipelines, and OpenA…
☆19,596 · Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,851 · Aug 16, 2024 · Updated last year
nalbion / whisper-server
Streaming speech-to-text server using Whisper
☆103 · Jun 2, 2023 · Updated 3 years ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,730 · Jun 15, 2026 · Updated last month
mldljyh / whisper_real_time_translation
The subtitles and translations are generated in real-time and displayed as pop-ups.
☆197 · Jun 8, 2023 · Updated 3 years ago
QwenAudio / SenseVoice
Open-source SenseVoiceSmall model for Mandarin, Cantonese, English, Japanese, and Korean ASR, language ID, emotion recognition, and audio…
☆8,981 · Updated this week
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,327 · Aug 10, 2024 · Updated last year
collabora / WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
☆1,647 · Jul 31, 2024 · Updated 2 years ago
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,582 · Dec 10, 2024 · Updated last year
yeyupiaoling / Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…
☆1,220 · May 8, 2026 · Updated 2 months ago
aiola-lab / whisper-medusa
Whisper with Medusa heads
☆861 · Jul 2, 2026 · Updated last month
facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,829 · Updated this week
sanchit-gandhi / whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
☆4,684 · Apr 3, 2024 · Updated 2 years ago