Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting"
☆48 · Jan 24, 2026 · Updated 2 months ago
Alternatives and similar repositories for MM-KWS
Users that are interested in MM-KWS are comparing it to the libraries listed below.
- Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023) ☆60 · Jun 3, 2024 · Updated last year
- Official code for Metric learning for user-defined keyword spotting ☆39 · Feb 21, 2024 · Updated 2 years ago
- Recipe for LibriPhrase ☆36 · Sep 2, 2023 · Updated 2 years ago
- Test-time adaptation for speech recognition model by single utterance. The official implementation of "Listen, Adapt, Better WER: Source-… ☆22 · Apr 1, 2022 · Updated 4 years ago
- Test Framework for few-shot open set KWS ☆42 · Nov 8, 2024 · Updated last year
- ☆32 · Aug 10, 2022 · Updated 3 years ago
- E2E ASR system ☆14 · Oct 20, 2022 · Updated 3 years ago
- Collection of PyTorch implementations of Spoken Keyword Spotting presented in research papers. ☆38 · Apr 5, 2024 · Updated 2 years ago
- [Tiny KWS] SparkNet: Sparse Binarization for Fast Keyword Spotting ☆17 · Aug 26, 2025 · Updated 7 months ago
- End-to-End Speech Processing Toolkit ☆15 · Jan 20, 2025 · Updated last year
- PyTorch implementation of BiFSMNv2, TNNLS 2023 ☆35 · Feb 10, 2023 · Updated 3 years ago
- PyTorch reimplementation of "Keyword Transformer: A Self-Attention Model for Keyword Spotting" ☆16 · Jul 23, 2021 · Updated 4 years ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus ☆187 · Dec 6, 2024 · Updated last year
- This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyw… ☆31 · Mar 6, 2025 · Updated last year
- Official implementation of Efficient Speech Separation Framework Based on Neural State-Space Models ☆26 · Feb 25, 2026 · Updated last month
- Production First and Production Ready End-to-End Keyword Spotting Toolkit ☆709 · Sep 17, 2025 · Updated 7 months ago
- ☆90 · May 31, 2023 · Updated 2 years ago
- ☆22 · Aug 25, 2025 · Updated 7 months ago
- This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection). ☆285 · May 23, 2022 · Updated 3 years ago
- Few-Shot Keyword Spotting ☆72 · Apr 11, 2021 · Updated 5 years ago
- ☆25 · Aug 29, 2025 · Updated 7 months ago
- Official repository of Fast-ULCNet. ☆29 · Feb 4, 2026 · Updated 2 months ago
- This is a repository for a paper accepted at the 2022 IEEE Spoken Language Technology Workshop (SLT 2022) ☆16 · Dec 1, 2022 · Updated 3 years ago
- PrimeK-Net official code ☆27 · Mar 5, 2025 · Updated last year
- Zero-Shot Blind Audio Bandwidth Extension ☆27 · May 25, 2023 · Updated 2 years ago
- [Not Official] Implementation of TC-Resnet, INTERSPEECH 2019 ☆22 · Jan 24, 2024 · Updated 2 years ago
- This is an extension of the Kaldi speech recognition software that allows decoding of speech with hybrid word and phoneme graphs.… ☆11 · Feb 4, 2020 · Updated 6 years ago
- ☆11 · Oct 24, 2022 · Updated 3 years ago
- Official code for Dense-TSNet ☆12 · Sep 17, 2024 · Updated last year
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition. ☆22 · May 26, 2025 · Updated 10 months ago
- A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors ☆25 · Jul 30, 2025 · Updated 8 months ago
- Perceptual Contrast Stretching on Target Feature for Speech Enhancement (Accepted by INTERSPEECH 2022) ☆73 · May 11, 2024 · Updated last year
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆38 · Aug 7, 2024 · Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 9 months ago
- Keyword spotting and forced alignment in any language ☆94 · Feb 12, 2026 · Updated 2 months ago
- DiffPhase: Generative Diffusion-based STFT Phase Retrieval ☆16 · Sep 21, 2023 · Updated 2 years ago
- FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks ☆18 · May 12, 2025 · Updated 11 months ago
- End-to-end ASR repository for AGI ☆20 · Dec 19, 2025 · Updated 4 months ago
- PyTorch based toolkit for developing spiking neural networks (SNNs) by training and testing them on speech command recognition tasks ☆30 · May 3, 2024 · Updated last year