Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"
☆23 · Mar 6, 2023 · Updated 3 years ago
Alternatives and similar repositories for UIT_Mobile
Users who are interested in UIT_Mobile are comparing it to the libraries listed below.
- Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging" · ☆31 · Apr 29, 2022 · Updated 3 years ago
- Continual Learning Benchmark for Spoken Keyword Spotting · ☆17 · Jun 7, 2022 · Updated 3 years ago
- Streaming Audiotransformers for online Audio tagging · ☆53 · Jun 14, 2024 · Updated last year
- steps to perform text-based speaker diarization with kaldi toolkit · ☆12 · Nov 2, 2018 · Updated 7 years ago
- This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyw… · ☆31 · Mar 6, 2025 · Updated last year
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core · ☆15 · Jun 19, 2023 · Updated 2 years ago
- Test Framework for few-shot open set KWS · ☆42 · Nov 8, 2024 · Updated last year
- ☆15 · Jul 11, 2022 · Updated 3 years ago
- Speechflow for emotion recognition related information decomposition · ☆10 · Jul 27, 2021 · Updated 4 years ago
- A C++ implementation of stft, melspectrogram and mel_to_stft · ☆10 · Jun 2, 2022 · Updated 3 years ago
- [Tiny KWS] SparkNet: Sparse Binarization for Fast Keyword Spotting · ☆17 · Aug 26, 2025 · Updated 7 months ago
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A… · ☆76 · Jun 16, 2025 · Updated 9 months ago
- EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System · ☆15 · Mar 31, 2019 · Updated 6 years ago
- Mining effective negative training samples for keyword spotting (PyTorch) · ☆64 · May 23, 2020 · Updated 5 years ago
- ☆10 · Dec 16, 2022 · Updated 3 years ago
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations · ☆64 · Jan 16, 2025 · Updated last year
- Official Implementation of GLAP - General Language Audio Pretraining · ☆65 · Jan 5, 2026 · Updated 2 months ago
- A library of speech gadgets. · ☆14 · Oct 15, 2022 · Updated 3 years ago
- Using YouTube to prepare a speech recognition dataset for any language · ☆10 · Mar 30, 2021 · Updated 4 years ago
- Official PyTorch code for Deep Audio-Signal Holistic Embeddings · ☆187 · Nov 7, 2025 · Updated 4 months ago
- Python runtime for WeTextProcessing (does not depend on Pynini) · ☆49 · Nov 28, 2025 · Updated 3 months ago
- A collection of all our phonemeizers for dataset construction and inference · ☆28 · Feb 21, 2025 · Updated last year
- Submission to the HEAR2021 Challenge · ☆17 · Mar 5, 2022 · Updated 4 years ago
- ☆11 · Nov 5, 2021 · Updated 4 years ago
- Filtering and Noise Adding Tool · ☆29 · May 27, 2022 · Updated 3 years ago
- Convert words to numbers · ☆21 · Apr 13, 2022 · Updated 3 years ago
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement" · ☆14 · Feb 13, 2022 · Updated 4 years ago
- readers that enable reading kaldi ark in tensorflow · ☆17 · Mar 7, 2018 · Updated 8 years ago
- Exploring Binary Classification Loss for Speaker Verification · ☆18 · Jul 18, 2023 · Updated 2 years ago
- Repository for my paper: Evaluation of Error and Correlation-Based Loss Functions For Multitask Learning Dimensional Speech Emotion Recog… · ☆20 · Mar 13, 2024 · Updated 2 years ago
- experiments about AudioSet · ☆43 · Jul 22, 2023 · Updated 2 years ago
- This repo related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH2024 · ☆38 · Feb 5, 2026 · Updated last month
- Java Bindings for the C++ library DeepSpeech · ☆10 · Jun 4, 2020 · Updated 5 years ago
- ☆10 · Sep 19, 2022 · Updated 3 years ago
- [ICASSP2023] Source code, model links and open test sets for paper SeACo-Paraformer. · ☆44 · Mar 15, 2024 · Updated 2 years ago
- Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres… · ☆23 · Mar 18, 2024 · Updated 2 years ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition" · ☆10 · Mar 15, 2023 · Updated 3 years ago
- TTS for Singlish using Tacotron2, the IMDA corpus, and Pachyderm. · ☆11 · Jan 11, 2020 · Updated 6 years ago
- Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data. · ☆11 · Nov 3, 2020 · Updated 5 years ago