k2-fsa/next-gen-kaldi-wechat

☆40

Alternatives and similar repositories for next-gen-kaldi-wechat

Users who are interested in next-gen-kaldi-wechat are comparing it to the libraries listed below.

k2-fsa / multi_quantization
☆46 · Nov 2, 2023 · Updated 2 years ago
k2-fsa / colab
Colab notebooks for Next-gen Kaldi
☆31 · Oct 12, 2025 · Updated 9 months ago
csukuangfj / transducer-loss-benchmarking
☆67 · Mar 25, 2022 · Updated 4 years ago
luomingshuang / k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode well and quickly.
☆16 · Jun 17, 2022 · Updated 4 years ago
k2-fsa / kaldifst
Python wrapper for OpenFST and its extensions from Kaldi. Also supports reading/writing ark/scp files.
☆56 · Apr 9, 2026 · Updated 3 months ago
csukuangfj / kaldi-hmm-gmm
☆28 · Apr 24, 2026 · Updated 2 months ago
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22 · May 26, 2025 · Updated last year
csukuangfj / optimized_transducer
Memory efficient transducer loss computation
☆70 · Jun 10, 2022 · Updated 4 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 · Oct 2, 2024 · Updated last year
k2-fsa / icefall
☆1,454 · Updated this week
kjw11 / CSEnet-ASR
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12 · Mar 14, 2025 · Updated last year
talhanai / wer-sigtest
Script to perform statistical significance test between ASR hypotheses.
☆23 · Aug 13, 2017 · Updated 8 years ago
TeaPoly / CTC-OptimizedLoss
Computes the MWER (minimum WER) Loss with CTC beam search. Knowledge distillation for CTC loss.
☆59 · Sep 6, 2023 · Updated 2 years ago
k2-fsa / fast_rnnt
A torch implementation of a recursion which turns out to be useful for RNN-T.
☆149 · Aug 25, 2023 · Updated 2 years ago
awni / future_speech
The History of Speech Recognition to the Year 2030
☆13 · Aug 14, 2021 · Updated 4 years ago
k2-fsa / k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
☆1,348 · Jul 11, 2026 · Updated last week
thu-spmi / CAT
CAT is more than a CRF-based ASR toolkit: it provides a complete workflow for data-efficient end-to-end ASR, supporting CTC, CTC-CRF, RNN…
☆368 · Feb 5, 2026 · Updated 5 months ago
thu-spmi / CTC-TTS
Code for CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment, Interspeech 2026.
☆20 · Jun 9, 2026 · Updated last month
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
edemattos / asr
Automatic Speech Recognition at the University of Edinburgh.
☆16 · Mar 14, 2021 · Updated 5 years ago
PhonemeHallucinator / Phoneme_Hallucinator
☆48 · Aug 16, 2023 · Updated 2 years ago
SpeechColab / PySpeechColab
A library of speech gadgets.
☆15 · Oct 15, 2022 · Updated 3 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15 · Nov 30, 2022 · Updated 3 years ago
csukuangfj / kaldi_native_io
python wrapper for kaldi's native I/O
☆27 · Jan 9, 2025 · Updated last year
cyfer0618 / kaldi-pytorch-rnnlm
Enable RNNLM lattice rescoring with Pytorch [kaldi]
☆12 · Jun 5, 2020 · Updated 6 years ago
jctian98 / e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
☆167 · Apr 29, 2022 · Updated 4 years ago
csukuangfj / kaldilm
Python wrapper for kaldi's arpa2fst
☆38 · Aug 27, 2025 · Updated 10 months ago
TeaPoly / warp-ctc-crf
An extension of thu-spmi/CAT which contains a full-fledged implementation of CTC-CRF for Tensorflow.
☆12 · Jul 5, 2021 · Updated 5 years ago
Hannes1 / react-native-wenet
Wenet speech to text for react native
☆10 · Nov 1, 2022 · Updated 3 years ago
prairie-schooner / wav2vec-vc
☆10 · Mar 22, 2023 · Updated 3 years ago
microsoft / UniSpeech
UniSpeech - Large Scale Self-Supervised Learning for Speech
☆486 · Apr 5, 2024 · Updated 2 years ago
vectominist / MiniASR
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53 · Dec 6, 2022 · Updated 3 years ago
k2-fsa / libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
☆220 · Sep 10, 2024 · Updated last year
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
amazon-science / iwslt-autodub-task
☆21 · Mar 4, 2024 · Updated 2 years ago
fengpeng-yue / ASRTTS
ASR & TTS joint training, asr, tts, machine speech chain
☆16 · Oct 16, 2021 · Updated 4 years ago
fchest / Speech-Transformer-multi-GPUs
A PyTorch implementation of Speech Transformer with multi-GPUs, an End-to-End ASR with Transformer network on Mandarin Chinese. This code…
☆10 · Dec 25, 2019 · Updated 6 years ago
RS2002 / Adversarial-MidiBERT
[ICMR 2025] Official Repository for The Paper, Let Network Decide What to Learn: Symbolic Music Understanding Model Based on Large-scale …
☆18 · Aug 17, 2025 · Updated 11 months ago
FantSun / Speechflow
Speechflow for emotion recognition related information decomposition
☆10 · Jul 27, 2021 · Updated 4 years ago