shashikg / WhisperS2T
An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines
☆436 · Updated 10 months ago
Alternatives and similar repositories for WhisperS2T
Users who are interested in WhisperS2T are comparing it to the libraries listed below.
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ☆331 · Updated 8 months ago
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆214 · Updated 8 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ☆769 · Updated last month
- Python bindings for whisper.cpp ☆275 · Updated last week
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ☆220 · Updated 3 months ago
- Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens ☆502 · Updated last year
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface. ☆317 · Updated 2 years ago
- ☆526 · Updated last year
- Whisper with Medusa heads ☆849 · Updated this week
- ☆239 · Updated 3 weeks ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆97 · Updated 3 weeks ago
- A Python package to build AI-powered real-time audio applications ☆1,356 · Updated 5 months ago
- OpenVINO version of openai/whisper ☆168 · Updated last year
- Joint speech-language model - respond directly to audio! ☆371 · Updated last year
- ☆300 · Updated last year
- G2P ☆272 · Updated 2 months ago
- ☆606 · Updated last year
- Improving transcription performance of OpenAI Whisper for CPU-based deployment ☆246 · Updated 2 years ago
- Pybind11 bindings for Whisper.cpp ☆333 · Updated 7 months ago
- Whisper realtime streaming for long speech-to-text transcription and translation ☆120 · Updated last year
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote ☆215 · Updated 4 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆96 · Updated last year
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆395 · Updated last year
- Text to speech alignment using CTC forced alignment ☆311 · Updated 3 months ago
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆160 · Updated 11 months ago
- How to use OpenAI's Whisper to transcribe and diarize audio files ☆346 · Updated 2 years ago
- ☆359 · Updated last year
- Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and … ☆320 · Updated last week
- Batch Support for OpenAI Whisper ☆94 · Updated last year
- Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS ☆889 · Updated 9 months ago