hitz-zentroa/whisper-lm

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/hitz-zentroa/whisper-lm)

hitz-zentroa / whisper-lm

Add n-gram and large language model (LLM) support to Whisper models.

☆43

Alternatives and similar repositories for whisper-lm

Users that are interested in whisper-lm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
View on GitHub
Using YouTube to prepare a speech recognition dataset for any language
☆10Mar 30, 2021Updated 5 years ago
jhuang448 / MultilingualALT
View on GitHub
Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model""
☆15Jun 28, 2024Updated 2 years ago
aholab / AhoTTS
View on GitHub
Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its…
☆18Jan 15, 2026Updated 6 months ago
naver / multilingual-distilwhisper
View on GitHub
This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.
☆34Apr 22, 2026Updated 3 months ago
bigcash / awesome-vad
View on GitHub
A curated list of awesome voice activity detection
☆75Nov 22, 2024Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
llm-lab-org / CLASP
View on GitHub
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13Jun 27, 2025Updated last year
tuanio / conformer-rnnt
View on GitHub
Conformer RNN-Transducer
☆14May 25, 2022Updated 4 years ago
OlaWod / PitchVC
View on GitHub
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
☆35Jun 6, 2024Updated 2 years ago
zhu-han / SpeechLLM
View on GitHub
LLM-based ASR recipe with Zipformer encoder and Qwen LLM
☆35Sep 25, 2025Updated 10 months ago
pkufool / simple-wer
View on GitHub
A simple command line tool to calculate WER for ASR.
☆14Updated this week
zzasdf / VietASR
View on GitHub
☆52Sep 3, 2025Updated 10 months ago
ArenAcikgoz / Whisper-Alignment
View on GitHub
Forced alignment decoder for Whisper.
☆16Mar 13, 2024Updated 2 years ago
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
iamanigeeit / present
View on GitHub
☆14Aug 19, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
jakariaemon / WSI
View on GitHub
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
☆26Jun 29, 2026Updated last month
VinAIResearch / PhoST
View on GitHub
A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022)
☆26Jun 5, 2025Updated last year
vocaliodmiku / wav2vec2mdd-Text
View on GitHub
☆19Jun 28, 2022Updated 4 years ago
dense-analysis / vim-speech
View on GitHub
Vim Speech Recognition Experiments
☆20May 30, 2025Updated last year
SpeechColab / PySpeechColab
View on GitHub
A library of speech gadgets.
☆15Oct 15, 2022Updated 3 years ago
ozspeech / OZSpeech
View on GitHub
[ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
☆45Feb 9, 2025Updated last year
uhh-lt / kaldi-model-server
View on GitHub
Simple Kaldi model server for chain (nnet3) models in online recognition mode directly from a local microphone
☆35Feb 18, 2022Updated 4 years ago
yucongzh / online_speaker_diarization
View on GitHub
☆15Jul 11, 2022Updated 4 years ago
artie-inc / artie-bias-corpus
View on GitHub
Artie Bias Corpus: an audio corpus + code for detecting demographic bias
☆20Jul 21, 2020Updated 6 years ago
Deploy open-source AI quickly and easily - Special Bonus Offer • Ad
Runpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
wonjune-kang / expressive-speech-retrieval
View on GitHub
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
☆15Aug 18, 2025Updated 11 months ago
halsay / ASR-TTS-paper-daily
View on GitHub
Update ASR paper everyday
☆513May 16, 2026Updated 2 months ago
ryanrudes / YTTTS
View on GitHub
The YouTube Text-To-Speech dataset is comprised of waveform audio extracted from YouTube videos alongside their English transcriptions
☆53Apr 1, 2021Updated 5 years ago
Ashigarg123 / ShiftySpeech
View on GitHub
☆15Jul 24, 2025Updated last year
desh2608 / kaldi-noise-vectors
View on GitHub
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13Feb 13, 2021Updated 5 years ago
Picovoice / text-to-speech-benchmark
View on GitHub
Text-to-Speech Benchmark
☆26Apr 2, 2026Updated 3 months ago
ex3ndr / supervoice-enhance
View on GitHub
Supervoice diffusion enhance
☆28Jul 15, 2024Updated 2 years ago
domcross / german-stt-evaluation
View on GitHub
Evaluation of STT models for german language
☆16Jan 22, 2022Updated 4 years ago
AkshathRaghav / tinyspeech
View on GitHub
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
☆23Jun 7, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
HLasse / multidiagnosis-speech
View on GitHub
☆10Jun 23, 2023Updated 3 years ago
charlesliucn / LanMIT
View on GitHub
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
☆22Jul 12, 2019Updated 7 years ago
pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
lourson1091 / audiobertscore
View on GitHub
☆15Nov 10, 2025Updated 8 months ago
JRMeyer / multi-task-kaldi
View on GitHub
An example directory for running Multi-Task Learning training on Kaldi neural networks. In Kaldi-speak, this is an egs dir for nnet3 trai…
☆55Jan 2, 2020Updated 6 years ago
speechio / asr-noises
View on GitHub
A handy dataset of noises for ASR
☆22May 29, 2019Updated 7 years ago
corticph / error-align
View on GitHub
Text-to-text alignment algorithm for speech recognition error analysis.
☆32Jun 23, 2026Updated last month