gumblex / whisper_vad
Whisper.cpp Speech-to-text with Voice Activity Detection
☆14Updated 3 months ago
Alternatives and similar repositories for whisper_vad:
Users that are interested in whisper_vad are comparing it to the libraries listed below
- Low Complexity Communication Codec Plus (mirror)☆11Updated 7 months ago
- Multilingual large voice generation model with full-stack support for inference, training, and deployment.☆9Updated 7 months ago
- ez audio transcription tool with flexible processing and post-processing options☆144Updated last year
- GFPGAN face reconstruction with ncnn on a bare Raspberry Pi☆12Updated 2 years ago
- Port of Funasr's Paraformer model in C/C++☆28Updated 8 months ago
- Generate subtitles for long movies / podcasts with OpenAI Whisper API.☆27Updated last year
- A lightweight pure C++ Text-to-Speech (TTS) pipeline with OpenVINO, supporting multiple languages.☆39Updated this week
- Inference TinyLlama models on ncnn☆24Updated last year
- ASR using the OpenAI-compatible API `v1/audio/transcriptions`, as offered by providers like Groq and SiliconFlow☆27Updated 5 months ago
- Experiments to test different speech recognition systems for SEPIA Framework☆58Updated last year
- Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.☆14Updated last month
- ONNX and TensorRT implementation of Whisper☆61Updated last year
- XCORE-VOICE Solution☆12Updated last week
- A chat UI for Llama.cpp☆12Updated last week
- Speaker diarization service☆21Updated this week
- Transcription and annotation interface for recorded audio or video files☆31Updated this week
- My development fork of llama.cpp. For now working on the RK3588 NPU and Tenstorrent backends☆84Updated 2 weeks ago
- Brand new TTS solution☆9Updated 2 months ago
- Simple, energy-based voice activity detection algorithm implementation.☆17Updated 10 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.☆40Updated this week
- A cross platform implementation of Text-to-Speech based on ONNXRuntime.☆32Updated last year
- Faster Whisper ASR transcription with CTranslate2☆19Updated 3 months ago
- A transformer-based multimodal model for music.☆28Updated 6 months ago
- ☆11Updated 3 years ago
- Russian phonetical transcription☆9Updated last year
- Baresip Applications Modules☆15Updated last week
- Tiny wrapper around webrtc-audio-processing for noise suppression/auto gain only☆18Updated 7 months ago
- Identify speakers with stable voice timbre.☆28Updated 8 months ago
- An enterprise-grade Voice Activity Detector from modelscope and funasr.☆76Updated last year