DigitLib / whisper-webui-vad
This combines forks of two repos to enable the OpenAI Whisper large image with VAD on low-VRAM GPUs.
☆33 Updated 2 years ago
Alternatives and similar repositories for whisper-webui-vad
Users who are interested in whisper-webui-vad are comparing it to the libraries listed below.
- Robust Speech Recognition via Large-Scale Weak Supervision ☆30 Updated last year
- ☆12 Updated last year
- ☆83 Updated 10 months ago
- This project provides a Flask-based API for generating high-quality text-to-speech (TTS) audio using F5-TTS, a flexible and powerful TTS … ☆12 Updated last month
- A browser interface based on the Gradio library for OpenAI's Whisper model. ☆42 Updated last year
- faster-whisper livestream translation, OBS noise reduction, dual language subtitles ☆78 Updated 2 years ago
- ☆47 Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated last year
- Towards Robust Blind Face Restoration with Codebook Lookup Transformer ☆30 Updated last year
- ☆54 Updated last year
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆60 Updated 6 months ago
- ☆22 Updated last year
- ☆27 Updated last year
- RTVC: Real-Time Voice Conversion GUI ☆55 Updated last year
- Speech AI training and inference tools ☆35 Updated last year
- web based editor for subtitles and transcripts ☆130 Updated 9 months ago
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆50 Updated 2 years ago
- Listen to any audio stream on your machine and print out the transcribed or translated audio. ☆119 Updated last year
- 1 min voice data can also be used to train a good TTS model! (few shot voice cloning) ☆25 Updated last week
- ☆8 Updated last year
- ☆13 Updated last year
- Archived 🚧|🌻Building ChatBot with LLMs.🌻 | Using async requests. | Adaptable to multiple LLMs | General-purpose large language model agent-side framework | Multi-persona, fully type-annotated ☆40 Updated last year
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh… ☆11 Updated last year
- ☆40 Updated last year
- AI 3D avatar voice interface in browser. VAD -> STT -> LLM -> TTS -> VRM (Prototype/Proof-of-Concept) ☆68 Updated last year
- Rust bindings for CTranslate2 ☆14 Updated last year
- ez audio transcription tool with flexible processing and post-processing options ☆149 Updated last year
- ☆39 Updated last year
- Running the F5-TTS by ONNX Runtime standalone with GUI ☆18 Updated 5 months ago
- Using OpenVINO to speed up MeloTTS inference ☆11 Updated 6 months ago