mispchallenge/misp2021_baseline

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/mispchallenge/misp2021_baseline)

mispchallenge / misp2021_baseline

☆29

Alternatives and similar repositories for misp2021_baseline

Users that are interested in misp2021_baseline are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

mispchallenge / MISP2021-AVSR
View on GitHub
repository for paper "Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis"
☆18Jun 17, 2022Updated 4 years ago
uark-cviu / Right2Talk
View on GitHub
[ICCV'21] The Right to Talk: An Audio-Visual Transformer Approach
☆20Aug 2, 2021Updated 4 years ago
ms-dot-k / LRW_ID
View on GitHub
The speaker-labeled information of LRW dataset, which is the outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Paddi…
☆10Oct 12, 2023Updated 2 years ago
desh2608 / kaldi-noise-vectors
View on GitHub
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13Feb 13, 2021Updated 5 years ago
jingyonghou / KWS_Max-pooling_RHE
View on GitHub
Mining effective negative training samples for keyword spotting (PyTorch)
☆66May 23, 2020Updated 6 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
janson9192 / autokws2021
View on GitHub
☆13Mar 25, 2021Updated 5 years ago
VKW2021 / kaldi-baseline
View on GitHub
kaldi cnn-tdnnf baseline
☆13Aug 31, 2021Updated 4 years ago
Mashiro009 / wenet-online-decoder-onnx
View on GitHub
☆40Aug 15, 2021Updated 4 years ago
R1ckShi / FrontEnd-AEC
View on GitHub
Acoustic echo cancelation(AEC) is a main algorithm in the pipe line of acoustic devices with KWS or ASR. FNLMS is used.
☆19Apr 22, 2019Updated 7 years ago
snsun / kaldi-decoder-code-reading
View on GitHub
☆33Oct 28, 2022Updated 3 years ago
mispchallenge / misp2022_baseline
View on GitHub
☆33Jun 26, 2023Updated 3 years ago
danpovey / quantization
View on GitHub
Torch-based tool for quantizing high-dimensional vectors using additive codebooks
☆54May 25, 2022Updated 4 years ago
xing96 / MIM-lipreading
View on GitHub
Code and model for paper <Mutual Information Maximization for Effective Lip Reading>
☆19Sep 4, 2020Updated 5 years ago
VIPL-Audio-Visual-Speech-Understanding / VIPL-AVSU-Group
View on GitHub
Collection of works from VIPL-AVSU
☆50Updated this week
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
mborsdorf / UniversalSpeakerExtraction
View on GitHub
☆15Sep 6, 2021Updated 4 years ago
Ephrem-ETH / E2E-KWS
View on GitHub
End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM
☆45Nov 18, 2022Updated 3 years ago
jiay7 / wenet_onlinedecode
View on GitHub
Went online decode demo
☆31Apr 28, 2021Updated 5 years ago
nttcslab-sp / torchain
View on GitHub
WIP: pytorch FFI wrapper for Kaldi chain loss (a.k.a. Lattice Free MMI)
☆20Feb 20, 2019Updated 7 years ago
mispchallenge / MISP-ICME-AVSR
View on GitHub
☆17Jan 1, 2024Updated 2 years ago
TeaPoly / CTC-OptimizedLoss
View on GitHub
Computes the MWER (minimum WER) Loss with CTC beam search. Knowledge distillation for CTC loss.
☆59Sep 6, 2023Updated 2 years ago
Edresson / GE2E-Speaker-Encoder
View on GitHub
GE2E Speaker Encoder - Generalized End-To-End Loss for Speaker Verification
☆14May 17, 2020Updated 6 years ago
whzikaros / g2pL
View on GitHub
The implementation of g2pL with a new open dataset.
☆16May 14, 2023Updated 3 years ago
Toshiba-China-RDC / dcase20_task4
View on GitHub
Couple learning on baseline of DCASE 2020 task 4
☆25Mar 9, 2022Updated 4 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
swagshaw / Rainbow-Keywords
View on GitHub
Rainbow Keywords - Official PyTorch Implementation
☆14Jun 27, 2024Updated 2 years ago
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
tencent-ailab / 3m-asr
View on GitHub
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition
☆119Jun 22, 2022Updated 4 years ago
HLTCHKUST / CI-AVSR
View on GitHub
Code repository for the Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR) dataset.
☆42Jul 16, 2024Updated 2 years ago
Tencent / StableToken
View on GitHub
[ICLR 2026] StableToken: A state-of-the-art noise-robust semantic speech tokenizer featuring Voting-LFQ for resilient SpeechLLMs.
☆33Feb 27, 2026Updated 4 months ago
kaituoxu / kaldi-ktnet1
View on GitHub
Kaldi extended by Kaituo XU with new features in nnet1.
☆12Dec 16, 2018Updated 7 years ago
tqbl / arca23k-dataset
View on GitHub
The code used to create the ARCA23K and ARCA23K-FSD datasets
☆16Nov 9, 2021Updated 4 years ago
SpeechColab / PySpeechColab
View on GitHub
A library of speech gadgets.
☆15Oct 15, 2022Updated 3 years ago
igormq / ctcdecode-pytorch
View on GitHub
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20Dec 16, 2020Updated 5 years ago
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
shengcanxu / canoSpeech
View on GitHub
text to speech
☆10Mar 19, 2024Updated 2 years ago
jctian98 / e2e_lfmmi
View on GitHub
E2E system with LF-MMI; word N-gram for Mandarin
☆167Apr 29, 2022Updated 4 years ago
csukuangfj / optimized_transducer
View on GitHub
Memory efficient transducer loss computation
☆70Jun 10, 2022Updated 4 years ago
speechio / BigCiDian
View on GitHub
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
☆263Oct 11, 2019Updated 6 years ago
l3das / L3DAS22
View on GitHub
☆57Jun 4, 2022Updated 4 years ago
deeplsd / Merkel-Podcast-Corpus
View on GitHub
This dataset is presented in the paper Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video…
☆12Sep 21, 2022Updated 3 years ago
Interlagos / TENet-kws
View on GitHub
Tensorflow implementation of "Small-Footprint Keyword Spotting with Multi-Scale Temporal Convolution"(INTERSPEECH 2020)
☆32Nov 11, 2020Updated 5 years ago