jim60105 / docker-whisperX
Dockerfile for WhisperX: Automatic Speech Recognition with Word-Level Timestamps and Speaker Diarization (Dockerfile, CI image build and test)
★276, updated last week
Alternatives and similar repositories for docker-whisperX
Users who are interested in docker-whisperX are comparing it to the repositories listed below.
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ★210, updated 6 months ago
- Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection ★705, updated 4 months ago
- FastAPI service on top of WhisperX ★95, updated this week
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ★213, updated last month
- ★1,809, updated last week
- Record audio and save a transcription to your system's clipboard with ctranslate2 and faster-whisper. ★114, updated last week
- An API to transcribe audio with OpenAI's Whisper Large v3! ★273, updated 6 months ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ★325, updated 6 months ago
- ★224, updated last month
- Open dubbing is an AI dubbing system which uses machine learning models to automatically translate and synchronize audio dialogue into di… ★218, updated 3 months ago
- High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs. ★353, updated 3 weeks ago
- Synchronize Whisper's timestamps over an existing accurate transcription ★148, updated 11 months ago
- An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engine ★413, updated 8 months ago
- The fastest Whisper optimization for automatic speech recognition as a command-line interface ⚡️ ★350, updated 11 months ago
- Private voice keyboard, AI chat, images, webcam, recordings, voice control with >= 4 GiB of VRAM. ★230, updated this week
- ★96, updated last year
- Listen to any audio stream on your machine and print out the transcribed or translated audio. ★119, updated last year
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote ★210, updated 2 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ★94, updated last year
- A User Interface for XTTS-2 Text-Based Voice Cloning using only 10 seconds of speech ★343, updated 5 months ago
- Live-Transcription (STT) with Whisper PoC ★181, updated 10 months ago
- API server for Instant voice cloning by MyShell. ★92, updated 7 months ago
- Real time speech to text transcription app. ★408, updated 2 years ago
- ez audio transcription tool with flexible processing and post-processing options ★149, updated last year
- Python bindings for whisper.cpp ★247, updated last week
- Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS ★865, updated 7 months ago
- ★482, updated last year
- A python package to build AI-powered real-time audio applications ★1,285, updated 3 months ago
- A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning mod… ★549, updated last week
- Self-host the powerful Dia TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), suppor… ★175, updated last week