jianfch/stable-ts

Transcription, forced alignment, and audio indexing with OpenAI's Whisper

☆2,277

Alternatives and similar repositories for stable-ts

Users who are interested in stable-ts are comparing it to the libraries listed below.

linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
☆2,829 · Sep 9, 2025 · Updated 10 months ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,229 · Jul 13, 2026 · Updated last week
johnafish / whisperer
Generate granular word-level captions in SRT format
☆58 · Sep 26, 2022 · Updated 3 years ago
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,503 · Nov 19, 2025 · Updated 8 months ago
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,327 · Updated this week
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,661 · Jul 16, 2026 · Updated last week
EtienneAb3d / WhisperHallu
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
☆350 · Nov 12, 2024 · Updated last year
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,608 · Feb 23, 2026 · Updated 5 months ago
EtienneAb3d / WhisperTimeSync
Synchronize Whisper's timestamps over an existing accurate transcription
☆165 · May 28, 2024 · Updated 2 years ago
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,097 · Jan 8, 2025 · Updated last year
nyrahealth / CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
☆994 · Updated this week
MahmoudAshraf97 / ctc-forced-aligner
Text to speech alignment using CTC forced alignment
☆526 · Jul 12, 2026 · Updated last week
mirix / approaches-to-diarisation
A testing repo to share code and thoughts on diarisation
☆58 · Mar 26, 2024 · Updated 2 years ago
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,506 · Apr 15, 2026 · Updated 3 months ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,259 · Jul 11, 2026 · Updated last week
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,624 · Dec 14, 2025 · Updated 7 months ago
sanchit-gandhi / whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
☆4,685 · Apr 3, 2024 · Updated 2 years ago
miguelvalente / whisperer
Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.
☆137 · Aug 14, 2023 · Updated 2 years ago
Softcatala / whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
☆1,332 · Feb 14, 2026 · Updated 5 months ago
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,316 · Aug 10, 2024 · Updated last year
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,653 · Nov 12, 2025 · Updated 8 months ago
Vaibhavs10 / insanely-fast-whisper
☆12,995 · Oct 25, 2025 · Updated 9 months ago
NVIDIA / BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
☆1,227 · Sep 5, 2024 · Updated last year
jumon / whisper-punctuator
Zero-shot multimodal punctuation insertion and truecasing using Whisper
☆120 · Feb 4, 2023 · Updated 3 years ago
resemble-ai / resemble-enhance
AI powered speech denoising and enhancement
☆2,371 · Dec 3, 2024 · Updated last year
lhotse-speech / lhotse
Tools for handling multimodal data in machine learning projects.
☆1,143 · Jun 22, 2026 · Updated last month
MontrealCorpusTools / Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
☆1,852 · Jul 11, 2026 · Updated last week
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,585 · Jul 3, 2026 · Updated 3 weeks ago
yl4579 / PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions
☆270 · Jan 13, 2025 · Updated last year
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,801 · Aug 16, 2024 · Updated last year
gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,143 · Aug 7, 2024 · Updated last year
juanmc2005 / diart
A Python package to build AI-powered real-time audio applications
☆2,005 · Jun 19, 2026 · Updated last month
baxtree / subaligner
Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Tran…
☆509 · Jul 13, 2026 · Updated last week
KdaiP / StableTTS
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
☆437 · Sep 13, 2024 · Updated last year
shirayu / whispering
Streaming transcriber with Whisper
☆696 · May 1, 2023 · Updated 3 years ago
facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,819 · Apr 8, 2026 · Updated 3 months ago
haoheliu / voicefixer
General Speech Restoration
☆1,355 · Feb 17, 2025 · Updated last year
MiscellaneousStuff / PhoneLM
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48 · Sep 4, 2023 · Updated 2 years ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,711 · Jun 15, 2026 · Updated last month