nkilm/offline-whisperx

Run different pipelines of WhisperX - Transcription, Diarization, VAD, Alignment completely OFFLINE.

☆48

Alternatives and similar repositories for offline-whisperx

Users who are interested in offline-whisperx are comparing it to the libraries listed below.

Mastering-Python-GT / Transcription-diarization-whisper-pyannote
Transcription and diarization (speaker identification)
☆33 · May 31, 2023 · Updated 3 years ago
tijszwinkels / whisperX-api
The WhisperX API is a containerized solution for transcribing audio files using the powerful `whisperx` model. This API provides an easy-…
☆18 · Aug 24, 2023 · Updated 2 years ago
thomasvvugt / whisperx
Docker image for WhisperX by Max Bain
☆13 · Sep 24, 2025 · Updated 10 months ago
mrseanryan / gpt-summarizer
Summarize (and translate) text using ChatGPT or a local LLM, with support for multiple large text files, PDF files. Preserves original st…
☆20 · Feb 14, 2026 · Updated 5 months ago
CrispStrobe / Susurrus
speech to text gui for different (e.g. Whisper, Voxtral) models and backends, including whisper.cpp, crispasar, mlx-whisper, faster-whisp…
☆27 · Updated this week
OlivierAlbertini / Voxtral-WebUI
A Web UI for easy subtitling using various models, including Voxtral
☆26 · Jul 22, 2025 · Updated last year
Fcabla / whisper_subtitler
Generate transcriptions and subtitles using OpenAI whisper as a base model, stable-ts/whisperx as a timestamp stabilizer using ASR models…
☆19 · Mar 10, 2023 · Updated 3 years ago
lukeewin / FunASR_API
A speech recognition API with speaker diarization, implemented using FunASR.
☆27 · Jun 16, 2026 · Updated last month
lovemefan / paraformer-python
Paraformer (Chinese ASR) online ONNX runtime for Python
☆54 · Mar 27, 2024 · Updated 2 years ago
Nik-Kras / Live_ASR_Whisper_Gradio
Real Time Speech To Text with corrections powered by Gradio
☆17 · Jan 13, 2025 · Updated last year
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
lovemefan / campplus
An open-source toolkit for single and multi-modal speaker verification from modelscope and funasr with onnx
☆15 · Dec 16, 2023 · Updated 2 years ago
StanfordScreenomics / Platform
The Stanford Screenomics is an open-source Android app framework for capturing real-time digital trace data to support behavioral and hea…
☆25 · Apr 6, 2026 · Updated 3 months ago
Deep-unlearning / Finetune-Parakeet
☆25 · Oct 22, 2025 · Updated 9 months ago
Sergey004 / silero_tts_rvc
A simple extension that allows LLM to speak in any voice, literally, based on Silero TTS which is available in oobabooga's textgen-webui …
☆12 · Aug 26, 2023 · Updated 2 years ago
Bartelds / ctc-dro
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17 · May 16, 2025 · Updated last year
jundaychan / funasr-fastapi
A simple API version of FunASR speech-to-text (FunASR + FastAPI), convenient to deploy on a server
☆13 · Aug 10, 2024 · Updated last year
guxm2021 / SVT_SpeechBrain
[TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing
☆28 · Aug 30, 2024 · Updated last year
hedrergudene / asr-sd-pipeline
Speech recognition & diarisation solution with text alignment, deployed in AML pipelines
☆102 · May 7, 2024 · Updated 2 years ago
dwsjoan / SRAS
Speech Recognition and Simple AI Summary: for local speech-to-text, speaker diarization, and simple AI summarization, with a web-based interface.
☆11 · Jul 22, 2024 · Updated 2 years ago
RemSynch / SenseVoice-Real-Time
A small project that combines VAD, a voiceprint lock, and SenseVoice for near-real-time speech transcription
☆42 · Sep 23, 2024 · Updated last year
YanZiBuGuiCHunShiWan / RESTFUL_ASR
A short-utterance online speech recognition service based on WeNet
☆11 · Feb 25, 2023 · Updated 3 years ago
allseeteam / whisperx-fastapi
WhisperX FastAPI integration
☆18 · Mar 31, 2024 · Updated 2 years ago
IS2AI / MultilingualASR
☆14 · Aug 9, 2021 · Updated 4 years ago
Gelelmaster / Funasr-Qwen-GPTSovits
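<Integrated> An AI program that uses FunASR for speech recognition, calls the Qwen large model for answers, and outputs speech through GPTSovits. The model is currently called online; an offline large model will be added later
☆13 · Nov 30, 2024 · Updated last year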
lukeewin / ASR_LLM_TTS_Front
ASR_LLM_TTS front-end project
☆15 · Dec 3, 2024 · Updated last year
ictnlp / LSG
The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”
☆15 · Jan 3, 2025 · Updated last year
NTIA / alignnet
Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.
☆18 · Aug 1, 2025 · Updated 11 months ago
kaihuhuang / Language-Group
☆11 · Dec 24, 2024 · Updated last year
Miamoto / Conformer-NTM
☆16 · Nov 9, 2023 · Updated 2 years ago
shuaijiang / Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training wit…
☆318 · Dec 22, 2025 · Updated 7 months ago
xingchensong / Speech-Transformer-plus-2DAttention
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
☆12 · May 7, 2019 · Updated 7 years ago
ysngki / XMoE
☆15 · Oct 19, 2024 · Updated last year
skjerns / sleep-utils
A collection of tools around sleep research: plotting of hypnograms / spectrograms, etc.
☆10 · Jan 24, 2026 · Updated 6 months ago
MorenoLaQuatra / vad
Simple voice activity detection (VAD) algorithm in Python
☆15 · Aug 10, 2023 · Updated 2 years ago
fjmatrix / gpt_assistant
Web app designed to enhance your interaction with OpenAI's language models
☆12 · Jun 14, 2023 · Updated 3 years ago
WalkerMitty / Fast-Llama2
Fast instruction tuning with Llama2
☆11 · Apr 8, 2024 · Updated 2 years ago
yjg30737 / pyqt-stable-diffusion-gui
PyQt(+PySide) Stable Diffusion GUI
☆15 · Aug 1, 2023 · Updated 2 years ago
Maryam-Nasseri / CrewAI-Local-Agents-1
Local LLM set-up
☆18 · Jul 1, 2024 · Updated 2 years ago