DigitLib / whisper-webui-vad
This is a combined fork of two repos that enables the OpenAI Whisper large image with VAD on low-VRAM GPUs.
☆34Updated last year
Alternatives and similar repositories for whisper-webui-vad:
Users who are interested in whisper-webui-vad are comparing it to the libraries listed below.
- Robust Speech Recognition via Large-Scale Weak Supervision☆30Updated last year
- ☆95Updated 10 months ago
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper.☆96Updated 3 weeks ago
- ☆48Updated last year
- ez audio transcription tool with flexible processing and post-processing options☆146Updated last year
- Whisper combined with Silero VAD, for improved long-form transcriptions☆47Updated 2 years ago
- A very simple implementation of edge_tts w/ RVC for oobabooga text-generation-webui.☆41Updated last year
- Listen to any audio stream on your machine and print out the transcribed or translated audio.☆117Updated last year
- Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and…☆51Updated this week
- ☆58Updated 5 months ago
- RVC Inference with multiple model and huggingface support☆103Updated last year
- AI 3D avatar voice interface in browser. VAD -> STT -> LLM -> TTS -> VRM (Prototype/Proof-of-Concept)☆66Updated last year
- ☆82Updated 8 months ago
- web based editor for subtitles and transcripts☆123Updated 6 months ago
- ☆66Updated 4 months ago
- stable-diffusion.cpp bindings for python☆42Updated last week
- A browser interface based on the Gradio library for OpenAI's Whisper model.☆40Updated last year
- Running the F5-TTS by ONNX Runtime☆115Updated last week
- whisper.cpp bindings for python☆89Updated last year
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆94Updated 10 months ago
- Efficient approach to speaker diarization using voice characteristics extraction☆91Updated 10 months ago
- Made slight modifications to the Tortoise API, provided 3 additional scripts to make using Tortoise easier. Less focus on cloning makes s…☆52Updated 10 months ago
- Llama cute voice assistant☆27Updated last year
- ☆68Updated 11 months ago
- AI powered speech denoising and enhancement. Adapted for windows and optimized☆81Updated 8 months ago
- OpenAI Whisper API-style local server, running on FastAPI☆75Updated 3 months ago
- Simplified installers for suno-ai/bark, musicgen, tortoise, RVC, demucs and vocos☆45Updated 8 months ago
- A performant high-throughput CPU-based API for Meta's No Language Left Behind (NLLB) using CTranslate2, hosted on Hugging Face Spaces.☆103Updated this week
- Running the F5-TTS by ONNX Runtime standalone with GUI☆14Updated 3 months ago
- Gradio WebUI for whisper, faster-whisper, whisper-timestamped. Supports YouTube Downloader, Vocal Remover and Transcription.☆42Updated 3 months ago