This repository contains code for applying Data2Vec to pretrain the Keyword Transformer model, as described in "Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining".
☆31 · Mar 6, 2025 · Updated last year
Alternatives and similar repositories for data2vec-KWS
Users that are interested in data2vec-KWS are comparing it to the libraries listed below.
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art results on the Speech Commands dataset V1 and V2. ☆109 · Jan 11, 2023 · Updated 3 years ago
- Test Framework for few-shot open set KWS ☆42 · Nov 8, 2024 · Updated last year
- Few-Shot Keyword Spotting ☆71 · Apr 11, 2021 · Updated 4 years ago
- Few-shot Keyword Spotting in Any Language and Multilingual Spoken Word Corpus ☆186 · Dec 6, 2024 · Updated last year
- Continual Learning Benchmark for Spoken Keyword Spotting ☆17 · Jun 7, 2022 · Updated 3 years ago
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS" ☆23 · Mar 6, 2023 · Updated 3 years ago
- This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection). ☆283 · May 23, 2022 · Updated 3 years ago
- Learning Efficient Representations for Keyword Spotting with Triplet Loss ☆113 · Sep 14, 2022 · Updated 3 years ago
- Official code for Metric learning for user-defined keyword spotting ☆38 · Feb 21, 2024 · Updated 2 years ago
- End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM ☆44 · Nov 18, 2022 · Updated 3 years ago
- Recipe for LibriPhrase ☆35 · Sep 2, 2023 · Updated 2 years ago
- Official PyTorch implementation of "Attention-Free Keyword Spotting", Mashrur. M. Morshed & Ahmad Omar Ahsan, PML4DC @ ICLR 2022. ☆15 · Nov 5, 2022 · Updated 3 years ago
- Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769 ☆139 · Apr 29, 2022 · Updated 3 years ago
- Multilingual and code-switching ASR challenges for low resource Indian languages. ☆21 · Jul 26, 2021 · Updated 4 years ago
- Keyword spotting and speech wake-up in PyTorch: DNN, CNN, TDNN, DFSMN, LSTM ☆54 · Mar 15, 2022 · Updated 4 years ago
- Keyword spotting and forced alignment in any language ☆92 · Feb 12, 2026 · Updated last month
- Attention-based model for keyword spotting ☆19 · Aug 9, 2021 · Updated 4 years ago
- Unofficial PyTorch implementation of "Keyword Transformer: A Self-Attention Model for Keyword Spotting", Berg et al. 2021. ☆40 · Oct 11, 2022 · Updated 3 years ago
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations" ☆41 · Aug 14, 2025 · Updated 7 months ago
- Production First and Production Ready End-to-End Keyword Spotting Toolkit ☆699 · Sep 17, 2025 · Updated 6 months ago
- Mining effective negative training samples for keyword spotting (PyTorch) ☆64 · May 23, 2020 · Updated 5 years ago
- Seeing Wake Words: Audio-visual Keyword Spotting ☆66 · Sep 16, 2020 · Updated 5 years ago
- Feedforward Sequential Memory Networks ☆16 · Aug 2, 2022 · Updated 3 years ago
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting" ☆47 · Jan 24, 2026 · Updated 2 months ago
- Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023) ☆59 · Jun 3, 2024 · Updated last year
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR. ☆26 · Jun 1, 2023 · Updated 2 years ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models. ☆33 · Nov 9, 2025 · Updated 4 months ago
- PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq. ☆25 · Oct 3, 2022 · Updated 3 years ago
- This is a repository for a paper accepted at the 2022 IEEE Spoken Language Technology Workshop (SLT 2022) ☆16 · Dec 1, 2022 · Updated 3 years ago
- Speechflow for emotion recognition related information decomposition ☆10 · Jul 27, 2021 · Updated 4 years ago
- A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV… ☆21 · Nov 25, 2024 · Updated last year
- ☆135 · Sep 23, 2020 · Updated 5 years ago
- NUS ME5413 Autonomous Mobile Robotics Final Project ☆18 · Apr 6, 2025 · Updated 11 months ago
- ☆32 · Aug 10, 2022 · Updated 3 years ago
- Compendium for the paper "Transparent pronunciation scoring using articulatorily weighted phoneme edit distance" by Karhila, Smolander, Y… ☆25 · May 6, 2019 · Updated 6 years ago
- The case study and multilingual performance of ICASSP submission ☆24 · Sep 24, 2022 · Updated 3 years ago
- Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised … ☆138 · Jan 20, 2024 · Updated 2 years ago
- An extension of thu-spmi/CAT which contains a full-fledged implementation of CTC-CRF for Tensorflow. ☆12 · Jul 5, 2021 · Updated 4 years ago
- ☆37 · Mar 30, 2021 · Updated 4 years ago