VKW2021/kaldi-baseline

kaldi cnn-tdnnf baseline

☆13

Alternatives and similar repositories for kaldi-baseline

Users who are interested in kaldi-baseline are comparing it to the repositories listed below.

LCF2764 / autoKWS2021_1st_solution
Auto-KWS 2021 Challenge 1st place solution.
☆11 · Jul 20, 2021 · Updated 5 years ago
usnistgov / F4DE
Framework for Detection Evaluation (F4DE) : set of evaluation tools for detection evaluations and for specific NIST-coordinated evaluatio…
☆26 · Jul 6, 2017 · Updated 9 years ago
Ephrem-ETH / E2E-KWS
End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM
☆45 · Nov 18, 2022 · Updated 3 years ago
jctian98 / e2e_lfmmi
E2E system with LF-MMI; word N-gram for Mandarin
☆167 · Apr 29, 2022 · Updated 4 years ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
lucadellalib / ts-asr
Target speaker automatic speech recognition (TS-ASR)
☆14 · Oct 14, 2023 · Updated 2 years ago
BUTSpeechFIT / hystoc
Getting confidences from any end-to-end systems
☆11 · May 24, 2023 · Updated 3 years ago
pigzach / MagicSpeechASR
magicspeech competition recipe
☆18 · Jun 29, 2020 · Updated 6 years ago
Yaoming95 / UniPunc
The case study and multilingual performance of ICASSP submission
☆24 · Sep 24, 2022 · Updated 3 years ago
tencent-ailab / 3m-asr
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
☆119 · Jun 22, 2022 · Updated 4 years ago
yichen14 / FastAdaSP
Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models". @ EMNLP'24(Oral)
☆17 · Nov 14, 2024 · Updated last year
mispchallenge / misp2021_baseline
☆29 · Jun 15, 2022 · Updated 4 years ago
Xianchao-Wu / wenet-deep-sparse-conformer
☆15 · Aug 25, 2022 · Updated 3 years ago
igormq / ctcdecode-pytorch
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20 · Dec 16, 2020 · Updated 5 years ago
lallubharteja / KWS-Scripts
Keyword Search Recipe for Subword ASR
☆30 · Jul 12, 2019 · Updated 7 years ago
rithiksachdev / PostASR-Correction-SLT2024
☆18 · Jul 22, 2024 · Updated last year
tzyll / ChineseHP
Dataset for Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models in Interspeech 2024.
☆16 · Jul 4, 2024 · Updated 2 years ago
cadia-lvl / samromur-asr
Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi
☆12 · Sep 30, 2022 · Updated 3 years ago
burchim / EfficientConformer
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
☆221 · Jun 22, 2023 · Updated 3 years ago
hhhaaahhhaa / ASR-TTA
☆16 · Nov 4, 2025 · Updated 8 months ago
keon / cpp-pytorch
C++ PyTorch Examples
☆10 · Aug 18, 2019 · Updated 6 years ago
DeepSpectrum / DeepSpectrumLite
Light-weight transfer learning framework for on-device speech and audio recognition using pre-trained image convolutional neural networks…
☆18 · Apr 16, 2022 · Updated 4 years ago
popcornell / MicRank
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.
☆22 · Apr 8, 2021 · Updated 5 years ago
Curzibn / yry
☆18 · Mar 18, 2020 · Updated 6 years ago
YUCHEN005 / RATS-Channel-A-Speech-Data
This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log…
☆16 · Oct 22, 2022 · Updated 3 years ago
jongwook / crepe
☆12 · Jun 5, 2018 · Updated 8 years ago
cywang97 / StreamingTransformer
☆277 · Jan 15, 2021 · Updated 5 years ago
farisalasmary / wav2vec2-kenlm
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74 · Oct 11, 2021 · Updated 4 years ago
janson9192 / autokws2021
☆13 · Mar 25, 2021 · Updated 5 years ago
bagustris / w2v2-vad
A wrapper for Audeering's wav2vec-based dimensional speech emotion recognition
☆22 · Aug 9, 2023 · Updated 2 years ago
xk-wang / MusicYOLO
MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram.
☆11 · Jan 29, 2022 · Updated 4 years ago
homink / speech.ko
Korean read speech corpus (about 120 hours, 17GB) from National Institute of Korean Language
☆43 · Feb 28, 2018 · Updated 8 years ago
Ming-er / Audio-Free-P-Tuning
☆11 · Dec 28, 2023 · Updated 2 years ago
Alibaba-NLP / AISHELL-NER
[ICASSP 2022] AISHELL-NER: Named Entity Recognition from Chinese Speech
☆26 · Apr 20, 2022 · Updated 4 years ago
celebrity-audio-collection / videoprocess
CN-Celeb, a large-scale Chinese celebrities dataset published by Center for Speech and Language Technology (CSLT) at Tsinghua University.
☆80 · Nov 9, 2019 · Updated 6 years ago
chimechallenge / chime-utils
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.
☆26 · Feb 25, 2025 · Updated last year
ga642381 / SpeechGen
《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》
☆77 · Jun 9, 2023 · Updated 3 years ago
YuanGongND / llm_speech_emotion_challenge
☆23 · Jun 24, 2024 · Updated 2 years ago
talhanai / wer-sigtest
Script to perform statistical significance test between ASR hypotheses.
☆23 · Aug 13, 2017 · Updated 8 years ago