esnya / realtime-whisper
ASR (Automatic Speech Recognition) for real-time streamed audio powered by Whisper and transformers
☆28Updated 3 months ago
Alternatives and similar repositories for realtime-whisper:
Users who are interested in realtime-whisper are comparing it to the libraries listed below
- SenseVoice-python: An enterprise-grade open-source multi-language ASR system from FunASR, with onnxruntime☆87Updated 6 months ago
- We Speech Transcript based on LLM, in 300 lines of code.☆149Updated 3 weeks ago
- A lightweight end-to-end text-to-speech model☆110Updated last month
- An enterprise-grade Voice Activity Detector from ModelScope and FunASR.☆87Updated last year
- Audio tools for Python☆12Updated 4 months ago
- Speech recognition built on FunASR, with a standard version and an ONNX version (recommended).☆36Updated 5 months ago
- Grapheme-to-Phoneme lexicons for Chinese dialects☆67Updated 2 years ago
- Speech Diarization for scrum automation☆102Updated last year
- ☆26Updated 3 weeks ago
- Port of FunASR's Paraformer model in C/C++☆30Updated 9 months ago
- ☆31Updated 3 weeks ago
- ONNX Inference of Pyannote Segmentation☆81Updated 3 months ago
- Utilizes ONNX Runtime to transcribe audio into text.☆18Updated last month
- A small project that combines VAD, a voiceprint lock, and SenseVoice in a simple implementation of near-real-time speech transcription☆19Updated 6 months ago
- Streaming ASR and TTS based on FastAPI + sherpa-onnx☆86Updated 5 months ago
- An API project for SenseVoice that outputs timestamped subtitles☆34Updated 5 months ago
- Paraformer (Chinese ASR) online ONNX Runtime for Python☆41Updated last year
- ASR 2Pass onnxruntime and websocket server, based on FunASR (https://github.com/alibaba-damo-academy/FunASR).☆63Updated 2 months ago
- Python Wrapper of Silero VAD☆48Updated 3 months ago
- flow mirror models from JZX AI Labs☆43Updated 5 months ago
- A simple audio denoising tool with a web UI and an API interface☆23Updated 4 months ago
- ChatTTS is a generative speech model for daily dialogue.☆14Updated 5 months ago
- Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3☆396Updated 6 months ago
- ONNX inference version of Bert-VITS2☆41Updated 11 months ago
- 🌻 VITS ONNX TTS server designed for fast inference 🔥☆127Updated last month
- ☆77Updated last year
- Transferability of cross-lingual and cross-age speech emotion recognition☆18Updated last year
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆93Updated 3 months ago
- Running F5-TTS with ONNX Runtime☆129Updated this week
- Repository for the paper: VoiceMe: Personalized voice generation in TTS☆126Updated 2 years ago