Python bindings for whisper.cpp
☆327 Feb 20, 2026 Updated 2 weeks ago
Alternatives and similar repositories for pywhispercpp
Users who are interested in pywhispercpp compare it to the libraries listed below.
- whisper.cpp bindings for Python ☆109 Aug 24, 2023 Updated 2 years ago
- Python bindings for whisper.cpp ☆249 Jun 1, 2024 Updated last year
- Pybind11 bindings for Whisper.cpp ☆343 Dec 8, 2024 Updated last year
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX. ☆15 Dec 23, 2024 Updated last year
- silero-vad PyTorch implementation ☆36 Nov 23, 2024 Updated last year
- Crowdsourced and Automatic Speech Prominence Estimation ☆25 Apr 12, 2024 Updated last year
- Silent Whisper inference for privacy and performance. Configured for GPU Spot Instances. ☆11 Sep 28, 2023 Updated 2 years ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 Jun 27, 2025 Updated 8 months ago
- [EMNLP Main '25] LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation ☆147 May 18, 2025 Updated 9 months ago
- Pybind11 bindings for Kaldi ☆15 Feb 1, 2026 Updated last month
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆106 Mar 30, 2025 Updated 11 months ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆857 Nov 16, 2024 Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆24 Oct 8, 2025 Updated 4 months ago
- A playground for experimenting with acoustic echo cancellation using a microphone, speaker, and ONNX. ☆13 Oct 22, 2024 Updated last year
- Models and code for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model ☆13 Mar 30, 2025 Updated 11 months ago
- ☆11 Sep 5, 2025 Updated 6 months ago
- A simple command line tool to calculate WER for ASR. ☆14 Oct 14, 2024 Updated last year
- Streaming Vocos ☆30 Jun 10, 2025 Updated 8 months ago
- Pybind11 bindings for Whisper.cpp ☆63 Updated this week
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022) ☆12 Mar 12, 2024 Updated last year
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition ☆14 Nov 15, 2025 Updated 3 months ago
- Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no Python package dependencies ☆15 Nov 25, 2024 Updated last year
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆69 Nov 1, 2024 Updated last year
- The implementation for "Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System". ☆30 Aug 2, 2025 Updated 7 months ago
- Demo Python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆46 Nov 6, 2023 Updated 2 years ago
- Colab notebooks for Next-gen Kaldi ☆30 Oct 12, 2025 Updated 4 months ago
- ☆19 Jan 8, 2025 Updated last year
- A composition of offline tools to achieve high quality multilingual speech to text transcription ☆23 Feb 2, 2026 Updated last month
- (WIP) Long-form speech generation ☆31 Apr 2, 2025 Updated 11 months ago
- Faster Whisper transcription with CTranslate2 ☆21,289 Nov 19, 2025 Updated 3 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of detail using Audio-Language-Models ☆35 Oct 13, 2024 Updated last year
- TTS-Wrapper makes it easier to use text-to-speech APIs by providing a unified and easy-to-use interface. ☆36 Feb 20, 2026 Updated 2 weeks ago
- A whisper <lib|cli|server> written in Rust ☆20 Jan 3, 2026 Updated 2 months ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation ☆24 Nov 12, 2025 Updated 3 months ago
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆52 May 22, 2025 Updated 9 months ago
- Port of OpenAI's Whisper model in C/C++ ☆47,262 Updated this week
- A Voice Activity Detector Rust library using the Silero VAD model. ☆62 Aug 4, 2025 Updated 7 months ago
- ☆21 Jul 15, 2024 Updated last year
- GitHub repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models ☆24 Jun 16, 2025 Updated 8 months ago