huggingface/distil-whisper

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

☆4,091

Alternatives and similar repositories for distil-whisper

Users who are interested in distil-whisper are comparing it to the libraries listed below.

SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,400 · Nov 19, 2025 · Updated 8 months ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,143 · Jul 13, 2026 · Updated last week
Vaibhavs10 / insanely-fast-whisper
☆12,988 · Oct 25, 2025 · Updated 8 months ago
sanchit-gandhi / whisper-jax
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
☆4,685 · Apr 3, 2024 · Updated 2 years ago
yl4579 / StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
☆6,312 · Aug 10, 2024 · Updated last year
WhisperSpeech / WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
☆4,625 · Dec 14, 2025 · Updated 7 months ago
facebookresearch / seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
☆11,816 · Apr 8, 2026 · Updated 3 months ago
huggingface / parler-tts
Inference and training library for high-quality TTS models.
☆5,582 · Dec 10, 2024 · Updated last year
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,302 · Updated this week
ggml-org / whisper.cpp
Port of OpenAI's Whisper model in C/C++
☆51,922 · Jul 11, 2026 · Updated last week
FL33TW00D / whisper-turbo
Cross-Platform, GPU Accelerated Whisper 🏎️
☆1,790 · Feb 27, 2024 · Updated 2 years ago
OpenNMT / CTranslate2
Fast inference engine for Transformer models
☆4,577 · Jul 3, 2026 · Updated 2 weeks ago
snakers4 / silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
☆9,631 · Updated this week
coqui-ai / TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
☆45,783 · Aug 16, 2024 · Updated last year
openai / whisper
Robust Speech Recognition via Large-Scale Weak Supervision
☆105,288 · Apr 15, 2026 · Updated 3 months ago
kyutai-labs / moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audi…
☆10,636 · May 16, 2026 · Updated 2 months ago
metavoiceio / metavoice-src
Foundational model for human-like, expressive TTS
☆4,203 · Jul 30, 2024 · Updated last year
suno-ai / bark
🔊 Text-Prompted Generative Audio Model
☆39,201 · Aug 19, 2024 · Updated last year
ufal / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆3,652 · Nov 12, 2025 · Updated 8 months ago
collabora / WhisperLive
A nearly-live implementation of OpenAI's Whisper.
☆4,141 · Updated this week
open-mmlab / Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junio…
☆9,957 · Mar 25, 2026 · Updated 3 months ago
myshell-ai / OpenVoice
Instant voice cloning by MIT and MyShell. Audio foundation model.
☆36,984 · Apr 19, 2025 · Updated last year
mit-han-lab / streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
☆7,248 · Jul 11, 2024 · Updated 2 years ago
jasonppy / VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
☆8,495 · May 30, 2026 · Updated last month
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,601 · Feb 23, 2026 · Updated 4 months ago
collabora / WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
☆1,646 · Jul 31, 2024 · Updated last year
NVIDIA-NeMo / Speech
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Auto…
☆17,794 · Updated this week
facebookresearch / audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor…
☆23,505 · Mar 3, 2026 · Updated 4 months ago
speechbrain / speechbrain
A PyTorch-based Speech Toolkit
☆11,697 · Jun 15, 2026 · Updated last month
linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
☆2,825 · Sep 9, 2025 · Updated 10 months ago
Vaibhavs10 / open-tts-tracker
☆1,147 · Feb 13, 2025 · Updated last year
fishaudio / fish-speech
SOTA Open Source TTS
☆31,332 · Jun 9, 2026 · Updated last month
netease-youdao / EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
☆8,491 · Aug 13, 2024 · Updated last year
espnet / espnet
End-to-End Speech Processing Toolkit
☆9,897 · Updated this week
neonbjb / tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
☆14,863 · Nov 19, 2024 · Updated last year
datalab-to / surya
OCR, layout analysis, reading order, table recognition in 90+ languages
☆21,123 · Updated this week
haotian-liu / LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
☆24,930 · Aug 12, 2024 · Updated last year
aiola-lab / whisper-medusa
Whisper with Medusa heads
☆860 · Jul 2, 2026 · Updated 2 weeks ago
Lightning-AI / litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
☆13,491 · Updated this week