esnya/realtime-whisper


esnya / realtime-whisper

ASR (Automatic Speech Recognition) for real-time streamed audio, powered by Whisper and transformers

☆36

Alternatives and similar repositories for realtime-whisper

Users that are interested in realtime-whisper are comparing it to the libraries listed below.


jundaychan / funasr-fastapi
A simple API version of FunASR speech-to-text (FunASR + FastAPI), easy to deploy on a server
☆13 · Aug 10, 2024 · Updated last year
cyhuang-tw / robust-vc
☆11 · May 7, 2022 · Updated 4 years ago
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
shtoshni / g2p
Code for SLT 2016 paper on Grapheme-to-Phoneme conversion using attention-based encoder-decoder models
☆15 · Feb 20, 2019 · Updated 7 years ago
k2-fsa / sherpa-mlx
sherpa with mlx
☆15 · Aug 2, 2025 · Updated 11 months ago
jzshq208886 / wenet_asr
☆12 · Jul 11, 2024 · Updated 2 years ago
HaujetZhao / FunASR-Online-Paraformer-Test
☆52 · Nov 26, 2023 · Updated 2 years ago
LidaGuo1999 / BUAA_Course
Shared repository of course materials for Beihang University (BUAA)
☆10 · Apr 21, 2019 · Updated 7 years ago
lukeewin / ASR_LLM_TTS_Front
Front-end project for ASR_LLM_TTS
☆15 · Dec 3, 2024 · Updated last year
Bartelds / ctc-dro
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17 · May 16, 2025 · Updated last year
catcto / CosyVoiceDocker
This repository provides a Docker image for CosyVoice
☆27 · Dec 22, 2024 · Updated last year
Gelelmaster / Funasr-Qwen-GPTSovits
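(Integrated) An AI program that uses FunASR for speech recognition, calls the Qwen large model for answers, and outputs speech through GPTSovits. The models are currently called online; offline large models will be added later.
☆13 · Nov 30, 2024 · Updated last year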
NTIA / alignnet
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18 · Aug 1, 2025 · Updated 11 months ago
Miamoto / Conformer-NTM
☆16 · Nov 9, 2023 · Updated 2 years ago
ysngki / XMoE
☆15 · Oct 19, 2024 · Updated last year
SandyPanda-MLDL / -Evaluation-Metrics-Used-For-The-Performance-Evaluation-of-Voice-Conversion-VC-Models
Evaluation Metrics Used For The Performance Evaluation of Voice Conversion (VC) Models
☆19 · Jul 8, 2025 · Updated last year
k2-fsa / colab
Colab notebooks for Next-gen Kaldi
☆31 · Oct 12, 2025 · Updated 9 months ago
MorenoLaQuatra / vad
Simple voice activity detection (VAD) algorithm in Python
☆15 · Aug 10, 2023 · Updated 2 years ago
Haste171 / llamaindex-retrieval-api
API to load and query documents using RAG
☆14 · Sep 25, 2023 · Updated 2 years ago
OlaWod / PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆35 · Jun 6, 2024 · Updated 2 years ago
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX runtime for Python
☆54 · Mar 27, 2024 · Updated 2 years ago
RemSynch / SenseVoice-Real-Time
A small project that implements near-real-time speech transcription using VAD + voiceprint lock + SenseVoice
☆42 · Sep 23, 2024 · Updated last year
SoonSYJ / fawasr
FunASR on-device offline version for Android, with full 2pass mode
☆15 · Sep 4, 2023 · Updated 2 years ago
voxos-ai / streaming-whisper-server
A streaming whisper server for on-prem transcription
☆23 · Aug 15, 2024 · Updated last year
ZachHandley / LlamaIndexAPI
A Docker image with Llama Index, Lang Chain, and a few other popular AI packages installed by default
☆11 · Nov 19, 2025 · Updated 8 months ago
tuanio / conformer-rnnt
Conformer RNN-Transducer
☆14 · May 25, 2022 · Updated 4 years ago
peilongchencc / My-FunASR
Speech recognition based on FunASR, including a regular version and an ONNX version (recommended).
☆53 · Oct 12, 2024 · Updated last year
zhoutuan / mod_funasr
FreeSWITCH ASR module forked from mod_audio_stream, using the FunASR online CPU version
☆20 · Jun 27, 2025 · Updated last year
QuadraV-Speech / funasr_seaco_paraformer_onnx_with_timestamp
Fixes the bug in FunASR where seaco-paraformer has no timestamps after being exported to ONNX
☆25 · Sep 12, 2024 · Updated last year
Mddct / simple-tts
(WIP) Long-form speech generation
☆30 · Apr 2, 2025 · Updated last year
frankyoujian / Edge-Punct-Casing
☆33 · Feb 4, 2025 · Updated last year
zenforic / csm-multi
Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…
☆26 · Mar 28, 2025 · Updated last year
jianyangshi / funasr-android
funasr-android: local deployment that builds a .so library for the Android side to call
☆50 · Mar 20, 2025 · Updated last year
BrightGu / SingleVC
Any-to-one voice conversion using a data augmentation strategy: pitch shifted, duration preserved.
☆34 · Jan 10, 2022 · Updated 4 years ago
sahand995 / Informed-RRT-Star
Informed Rapidly-exploring Random Tree-Star with C# Programming
☆10 · Nov 6, 2021 · Updated 4 years ago
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆27 · Jun 6, 2026 · Updated last month
lukeewin / ASR_LLM_TTS
This is a web-based intelligent dialogue program built using ASR, LLM, and TTS.
☆25 · Dec 3, 2024 · Updated last year
marcinmatys / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆22 · Nov 4, 2024 · Updated last year