linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

☆2,831

Alternatives and similar repositories for whisper-timestamped

Users interested in whisper-timestamped are comparing it to the libraries listed below.

jianfch / stable-ts
Transcription, forced alignment, and audio indexing with OpenAI's Whisper
☆2,283 · May 30, 2026 · Updated last month
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,304 · Jul 13, 2026 · Updated 2 weeks ago
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,595 · Nov 19, 2025 · Updated 8 months ago
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,340 · Updated this week
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,611 · Feb 23, 2026 · Updated 5 months ago
nyrahealth / CrisperWhisper
Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection
☆1,021 · Updated this week
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,657 · Nov 12, 2025 · Updated 8 months ago
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,764 · Jul 16, 2026 · Updated last week
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆2,006 · Jun 19, 2026 · Updated last month
huggingface / distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
☆4,099 · Jan 8, 2025 · Updated last year
EtienneAb3d / WhisperHallu
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
☆351 · Nov 12, 2024 · Updated last year
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,320 · Aug 10, 2024 · Updated last year
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,717 · Jun 15, 2026 · Updated last month
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,625 · Dec 14, 2025 · Updated 7 months ago
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,839 · Apr 15, 2026 · Updated 3 months ago
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆52,364 · Jul 11, 2026 · Updated 2 weeks ago
sanchit-gandhi / whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
☆4,686 · Apr 3, 2024 · Updated 2 years ago
Vaibhavs10 / insanely-fast-whisper
☆12,997 · Oct 25, 2025 · Updated 9 months ago
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,190 · Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,827 · Aug 16, 2024 · Updated last year
MontrealCorpusTools / Montreal-Forced-Aligner
Command line utility for forced alignment using Kaldi
☆1,855 · Jul 11, 2026 · Updated 2 weeks ago
davabase / whisper_real_time
Real time transcription with OpenAI Whisper.
☆2,939 · Apr 15, 2025 · Updated last year
Softcatala / whisper-ctranslate2
Whisper command line client compatible with original OpenAI client based on CTranslate2.
☆1,333 · Feb 14, 2026 · Updated 5 months ago
YuanGongND / whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …
☆422 · Feb 21, 2024 · Updated 2 years ago
bootphon / phonemizer
Simple text to phones converter for multiple languages
☆1,558 · Sep 26, 2024 · Updated last year
gemelo-ai / vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
☆1,146 · Aug 7, 2024 · Updated last year
EtienneAb3d / WhisperTimeSync
Synchronize Whisper's timestamps over an existing accurate transcription
☆165 · May 28, 2024 · Updated 2 years ago
facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,826 · Updated this week
wq2012 / awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
☆1,886 · Jul 7, 2026 · Updated 3 weeks ago
neonbjb / tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
☆14,865 · Nov 19, 2024 · Updated last year
shivammehta25 / Matcha-TTS
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
☆1,337 · Jul 13, 2026 · Updated 2 weeks ago
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,594 · Jul 3, 2026 · Updated 3 weeks ago
NVIDIA-NeMo / Speech
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆17,829 · Updated this week
vasistalodagala / whisper-finetune
Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.
☆365 · May 23, 2023 · Updated 3 years ago
lhotse-speech / lhotse
Tools for handling multimodal data in machine learning projects.
☆1,143 · Jun 22, 2026 · Updated last month
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,580 · Dec 10, 2024 · Updated last year
espnet / espnet
End-to-End Speech Processing Toolkit
☆9,903 · Updated this week
suno-ai / bark
🔊 Text-Prompted Generative Audio Model
☆39,214 · Aug 19, 2024 · Updated last year
Vaibhavs10 / fast-whisper-finetuning
☆562 · Jul 10, 2024 · Updated 2 years ago