mobiusml / faster-whisper
Faster Whisper ASR transcription with CTranslate2
☆19 Updated 2 months ago
Alternatives and similar repositories for faster-whisper:
Users who are interested in faster-whisper are comparing it to the libraries listed below.
- Speaker diarization service ☆20 Updated 3 weeks ago
- StyleTTS 2 Optimized Training Fork ☆15 Updated this week
- Audio tokenization, in the fastest way possible! ☆46 Updated 4 months ago
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement. ☆11 Updated 2 months ago
- G2P ☆20 Updated this week
- Easy tool that splits given audio based on speaker. ☆11 Updated last year
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level. ☆25 Updated 2 years ago
- GGML implementation of BERT model with Python bindings and quantization. ☆52 Updated 11 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆24 Updated 3 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using… ☆28 Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆90 Updated 3 months ago
- ForceAlign is a Python library for forced alignment of English text to English audio. You can use ForceAlign to get word or phoneme level… ☆10 Updated last month
- WarpRNNT loss ported in Numba CPU/CUDA for Pytorch ☆16 Updated 2 years ago
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆19 Updated 4 months ago
- Experiments with BitNet inference on CPU ☆52 Updated 9 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆53 Updated last month
- Supervoice diffusion enhance ☆26 Updated 6 months ago
- A lightweight Python library for running TTS models with a unified API. ☆13 Updated last week
- Trying to build an all in one speech-text language model - a bit like GPT-4o ☆22 Updated 7 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆19 Updated 2 months ago
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆33 Updated last week
- Tunable pipelines ☆31 Updated this week
- Joint speech-language model - respond directly to audio! ☆30 Updated 8 months ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆53 Updated last month
- Whisper combined with Silero VAD, for improved long-form transcriptions ☆45 Updated 2 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 Updated last year
- Simple PyTorch Denoisers for Waveform Audio ☆34 Updated last month