kmario23/KenLM-training

Training an n-gram-based language model using the KenLM toolkit for Deep Speech 2

☆116

Alternatives and similar repositories for KenLM-training

Users that are interested in KenLM-training are comparing it to the libraries listed below.

patrickvonplaten / Wav2Vec2_PyCTCDecode
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode
☆110 · Aug 31, 2022 · Updated 3 years ago
kpu / kenlm
KenLM: Faster and Smaller Language Model Queries
☆2,793 · Mar 30, 2025 · Updated last year
nelson-liu / flatten_gigaword
Dump the text of the Gigaword dataset into a single file, for use with language modeling (and other!) toolkits
☆23 · Sep 23, 2017 · Updated 8 years ago
applicaai / pyramidions
This repository contains a demonstrative implementation for pooling-based models, e.g., DeepPyramidion complementing our paper "Sparsifyi…
☆14 · May 15, 2022 · Updated 4 years ago
amazon-science / proteno
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…
☆45 · May 25, 2021 · Updated 5 years ago
t13m / kaldi-readers-for-tensorflow
Readers that enable reading Kaldi ark files in TensorFlow
☆17 · Mar 7, 2018 · Updated 8 years ago
jagabandhumishra / W2V-E2E-Language-Diarization
☆11 · Sep 4, 2023 · Updated 2 years ago
Prem-kumar27 / Fast-KTSpeechCrawler
Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler
☆23 · Mar 21, 2021 · Updated 5 years ago
TartuNLP / tts_preprocess_et
Estonian text-to-speech text normalization pipeline
☆14 · Dec 17, 2025 · Updated 7 months ago
chinshr / sctk
Speech Recognition Scoring Toolkit
☆13 · Sep 30, 2015 · Updated 10 years ago
DCGM / SoftCTC
This repository contains the source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135
☆19 · Mar 7, 2023 · Updated 3 years ago
k2-fsa / fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149 · Aug 25, 2023 · Updated 2 years ago
lumaku / ctc-segmentation
Segment an audio file and obtain utterance alignments. (Python package)
☆348 · May 15, 2024 · Updated 2 years ago
farisalasmary / wav2vec2-kenlm
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74 · Oct 11, 2021 · Updated 4 years ago
kensho-technologies / pyctcdecode
A fast and lightweight python-based CTC beam search decoder for speech recognition.
☆469 · Jul 13, 2023 · Updated 3 years ago
lorenlugosch / transducer-tutorial
Example code for a neural transducer model.
☆68 · Feb 10, 2024 · Updated 2 years ago
belambert / asr-evaluation
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
☆284 · Aug 15, 2023 · Updated 2 years ago
AppleHolic / pytorch_sound
A PyTorch repository for boosting sound-related deep learning tasks
☆88 · Jul 25, 2024 · Updated 2 years ago
JRMeyer / easy-kaldi
Use your data to create a speech recognition system in Kaldi. Fast.
☆65 · Jan 2, 2020 · Updated 6 years ago
NickRuiz / power-asr
Phonetically-Oriented Word Error Rate
☆36 · May 4, 2019 · Updated 7 years ago
facebookresearch / voxpopuli
A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation
☆574 · Apr 2, 2023 · Updated 3 years ago
theblackcat102 / edgedict
Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)
☆292 · Aug 5, 2021 · Updated 4 years ago
lingjzhu / spoken_sent_embedding
Unsupervised spoken sentence embeddings
☆14 · Dec 14, 2022 · Updated 3 years ago
dobby-seo / korean-speech-recognition-quartznet
Korean speech recognition with QuartzNet, a quantized model based on Jasper
☆22 · Jul 21, 2021 · Updated 5 years ago
mjansche / thrax
Read-only unofficial mirror of the OpenGrm Thrax Grammar Development Tools
☆16 · May 2, 2019 · Updated 7 years ago
msalhab96 / RNN-Transducer
PyTorch implementation of Sequence Transduction with Recurrent Neural Networks (RNN-T) speech recognition paper
☆16 · Mar 4, 2022 · Updated 4 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
HawkAaron / warp-transducer
A fast parallel implementation of RNN Transducer.
☆314 · Jun 7, 2023 · Updated 3 years ago
HudsonHuang / waveglow_vocoder
A vocoder that can convert audio to Mel-Spectrogram and reverse with WaveGlow, with GPU.
☆16 · Feb 9, 2025 · Updated last year
falabrasil / ufpalign
👄🇧🇷 Forced phonetic alignment in Brazilian Portuguese
☆13 · Jul 18, 2025 · Updated last year
MiuLab / Lattice-Transformer-SLU
Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"
☆10 · Jul 8, 2020 · Updated 6 years ago
lucidrains / conformer
Implementation of the convolutional module from the Conformer paper, for use in Transformers
☆438 · May 17, 2023 · Updated 3 years ago
snakers4 / cft-contest-2018
Repository with illustrations for cft-contest-2018
☆12 · Nov 22, 2018 · Updated 7 years ago
usnistgov / SCTK
☆242 · Nov 27, 2023 · Updated 2 years ago
OdiaGenAI / Indic_LLM_Resource_Catalog
A catalog listing instruction sets and models available for Indic languages
☆10 · Mar 14, 2024 · Updated 2 years ago
parlance / ctcdecode
PyTorch CTC Decoder bindings
☆860 · Apr 4, 2024 · Updated 2 years ago
burchim / EfficientConformer
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
☆221 · Jun 22, 2023 · Updated 3 years ago
gpu-poor / gramvaani_hindi_asr
This repo contains the baseline model recipes and pre-trained model for the GramVaani Hindi ASR challenge
☆16 · Mar 26, 2022 · Updated 4 years ago
keonlee9420 / WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
☆68 · Aug 3, 2021 · Updated 4 years ago