espnet/espnet_model_zoo

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/espnet/espnet_model_zoo)

espnet / espnet_model_zoo

ESPnet Model Zoo

☆260

Alternatives and similar repositories for espnet_model_zoo

Users that are interested in espnet_model_zoo are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

espnet / espnet
View on GitHub
End-to-End Speech Processing Toolkit
☆9,901Updated this week
lhotse-speech / lhotse
View on GitHub
Tools for handling multimodal data in machine learning projects.
☆1,143Jun 22, 2026Updated last month
SpeechColab / GigaSpeech
View on GitHub
Large, modern dataset for speech recognition
☆731Feb 26, 2024Updated 2 years ago
m-wiesner / nnet_pytorch
View on GitHub
Kaldi style neural network training in pytorch for use in place of nnet3 in Kaldi.
☆26Jul 25, 2024Updated 2 years ago
gpu-poor / gramvaani_hindi_asr
View on GitHub
This repo contains the baseline model recipes and pre-trained model for GramVanni hindi ASR challenge
☆16Mar 26, 2022Updated 4 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
espnet / espnet_onnx
View on GitHub
Onnx wrapper for espnet infrernce model
☆169Aug 11, 2025Updated 11 months ago
tencent-ailab / pika
View on GitHub
a lightweight speech processing toolkit based on Pytorch and (Py)Kaldi
☆354Dec 25, 2020Updated 5 years ago
cywang97 / StreamingTransformer
View on GitHub
☆277Jan 15, 2021Updated 5 years ago
k2-fsa / fast_rnnt
View on GitHub
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149Aug 25, 2023Updated 2 years ago
k2-fsa / k2
View on GitHub
FSA/FST algorithms, differentiable, with PyTorch compatibility.
☆1,348Jul 11, 2026Updated 2 weeks ago
groadabike / Kaldi-Dsing-task
View on GitHub
DSing ASR task: Resources and Baseline for an unaccompanied singing ASR.
☆19Jul 9, 2026Updated 2 weeks ago
guozixunnicolas / DENT_DDSP
View on GitHub
☆24Jun 30, 2023Updated 3 years ago
k2-fsa / kaldifst
View on GitHub
Python wrapper for OpenFST and its extensions from Kaldi. Also support reading/writing ark/scp files
☆56Apr 9, 2026Updated 3 months ago
espnet / interspeech2019-tutorial
View on GitHub
INTERSPEECH 2019 Tutorial Materials
☆194Mar 30, 2021Updated 5 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
cyfer0618 / kaldi-pytorch-rnnlm
View on GitHub
Enable RNNLM lattice rescoring with Pytorch [kaldi]
☆12Jun 5, 2020Updated 6 years ago
s3prl / s3prl
View on GitHub
Self-Supervised Speech Pre-training and Representation Learning Toolkit
☆2,557Mar 12, 2026Updated 4 months ago
csukuangfj / kaldilm
View on GitHub
Python wrapper for kaldi's arpa2fst
☆38Aug 27, 2025Updated 10 months ago
kan-bayashi / ParallelWaveGAN
View on GitHub
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch
☆1,646Apr 22, 2024Updated 2 years ago
kate-egorova / ASR-hybrid-decoding
View on GitHub
This is an extension of kaldi speech recognition software which allows to perform decoding of speech with hybrid word and phoneme graphs.…
☆11Feb 4, 2020Updated 6 years ago
espnet / espnet_tts_frontend
View on GitHub
Text frontend for ESPnet tts recipes
☆35Jun 1, 2021Updated 5 years ago
nvidia-riva / riva-asrlib-decoder
View on GitHub
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
☆91Feb 18, 2025Updated last year
HuangZiliAndy / SSL_for_multitalker
View on GitHub
ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS
☆33Mar 16, 2023Updated 3 years ago
wavlab-speech / cmu_multilingual_speech
View on GitHub
CMU multilingual speech repository
☆30Apr 15, 2022Updated 4 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
hirofumi0810 / neural_sp
View on GitHub
End-to-end ASR/LM implementation with PyTorch
☆594Aug 30, 2021Updated 4 years ago
pzelasko / kaldialign
View on GitHub
Python wrappers for Kaldi Levenshtein's distance and alignment code.
☆70Jun 15, 2026Updated last month
csukuangfj / kaldifeat
View on GitHub
Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…
☆215Jul 10, 2026Updated 2 weeks ago
facebookresearch / WavAugment
View on GitHub
A library for speech data augmentation in time-domain
☆689Aug 30, 2021Updated 4 years ago
wavlab-speech / shinjiwlab.github.io
View on GitHub
☆18Updated this week
funcwj / setk
View on GitHub
Tools for Speech Enhancement integrated with Kaldi
☆432Jul 6, 2023Updated 3 years ago
microsoft / UniSpeech
View on GitHub
UniSpeech - Large Scale Self-Supervised Learning for Speech
☆486Apr 5, 2024Updated 2 years ago
cornerfarmer / ctc_segmentation
View on GitHub
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75Oct 9, 2020Updated 5 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
hitachi-speech / EEND
View on GitHub
End-to-End Neural Diarization
☆435Aug 30, 2021Updated 4 years ago
lumaku / ctc-segmentation
View on GitHub
Segment an audio file and obtain utterance alignments. (Python package)
☆348May 15, 2024Updated 2 years ago
jzlianglu / pykaldi2
View on GitHub
Yet another speech toolkit based on Kaldi and PyTorch
☆173Jul 1, 2020Updated 6 years ago
Kyubyong / g2p
View on GitHub
g2p: English Grapheme To Phoneme Conversion
☆926Jan 5, 2023Updated 3 years ago
manojpamk / pytorch_xvectors
View on GitHub
Deep speaker embeddings in PyTorch, including x-vectors. Code used in this work: https://arxiv.org/abs/2007.16196
☆321Nov 11, 2020Updated 5 years ago
jindongwang / EasyEspnet
View on GitHub
Making Espnet easier to use
☆54Apr 9, 2021Updated 5 years ago
wenet-e2e / speech-recognition-papers
View on GitHub
Towards hot directions in industrial end to end speech recognition
☆329Nov 30, 2021Updated 4 years ago