RicherMans/UIT_Mobile

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/RicherMans/UIT_Mobile)

RicherMans / UIT_Mobile

Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS"

☆24

Alternatives and similar repositories for UIT_Mobile

Users that are interested in UIT_Mobile are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

RicherMans / PSL
View on GitHub
Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"
☆31Apr 29, 2022Updated 4 years ago
huangyz0918 / kws-continual-learning
View on GitHub
[ICASSP'22] Continual Learning Benchmark for Spoken Keyword Spotting
☆17Jun 7, 2022Updated 4 years ago
RicherMans / SAT
View on GitHub
Streaming Audiotransformers for online Audio tagging
☆57Jun 14, 2024Updated 2 years ago
talhanai / kaldi-diar-latte
View on GitHub
steps to perform text-based speaker diarization with kaldi toolkit
☆12Nov 2, 2018Updated 7 years ago
HolgerBovbjerg / data2vec-KWS
View on GitHub
This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyw…
☆32Mar 6, 2025Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
tiro-is / tiro-speech-core
View on GitHub
This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core
☆15Jun 19, 2023Updated 3 years ago
yucongzh / online_speaker_diarization
View on GitHub
☆15Jul 11, 2022Updated 4 years ago
ryoha000 / librosapp
View on GitHub
A C++ implementation of stft, melspectrogram and mel_to_stft
☆11Jun 2, 2022Updated 4 years ago
FantSun / Speechflow
View on GitHub
Speechflow for emotion recognition related information decomposition
☆10Jul 27, 2021Updated 4 years ago
WelkinYang / EMPHASIS-pytorch
View on GitHub
EMPHASIS: An Emotional Phoneme-based Acoustic Model for Speech Synthesis System
☆15Mar 31, 2019Updated 7 years ago
jingyonghou / KWS_Max-pooling_RHE
View on GitHub
Mining effective negative training samples for keyword spotting (PyTorch)
☆66May 23, 2020Updated 6 years ago
muthissar / diffstm
View on GitHub
☆10Dec 16, 2022Updated 3 years ago
mrusci / ondevice-learning-kws
View on GitHub
Test Framework for few-shot open set KWS
☆45Nov 8, 2024Updated last year
vivian556123 / NeurIPS2024-CoVoMix
View on GitHub
Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
☆67Jan 16, 2025Updated last year
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
View on GitHub
Using YouTube to prepare a speech recognition dataset for any language
☆10Mar 30, 2021Updated 5 years ago
Speech-Lab-IITM / CCC-wav2vec-2.0
View on GitHub
Code for the method proposed in the paper:- ccc-wav2vec 2.0: Clustering aided Cross-Contrastive learning of Self-Supervised speech repres…
☆23Mar 18, 2024Updated 2 years ago
SpeechColab / PySpeechColab
View on GitHub
A library of speech gadgets.
☆15Oct 15, 2022Updated 3 years ago
RicherMans / HEAR2021_EfficientLatent
View on GitHub
Submission to the HEAR2021 Challenge
☆17Mar 5, 2022Updated 4 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
View on GitHub
☆11Nov 5, 2021Updated 4 years ago
i3thuan5 / FaNT
View on GitHub
Filtering and Noise Adding Tool
☆29May 27, 2022Updated 4 years ago
revdotcom / words2num
View on GitHub
Convert words to numbers
☆21Apr 13, 2022Updated 4 years ago
kimsunwiub / BLOOM-Net
View on GitHub
Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement"
☆14Feb 13, 2022Updated 4 years ago
bagustris / ccc_mse_ser
View on GitHub
Repository for my paper: Evaluation of Error and Correlation-Based Loss Functions For Multitask Learning Dimensional Speech Emotion Recog…
☆20Mar 13, 2024Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
lijuncheng16 / AudioTaggingDoneRight
View on GitHub
experiments about AudioSet
☆43Jul 22, 2023Updated 3 years ago
Hunterhuan / sphereface2_speaker_verification
View on GitHub
Exploring Binary Classification Loss for Speaker Verification
☆18Jul 18, 2023Updated 3 years ago
mikex86 / DeepSpeech-Java-Bindings
View on GitHub
Java Bindings for the C++ library DeepSpeech
☆10Jun 4, 2020Updated 6 years ago
Chung-I / youtube-asr-crawler
View on GitHub
☆10Sep 19, 2022Updated 3 years ago
NieeiM / Dasheng-Audiogen
View on GitHub
Generate a complete audio clip with music, intelligible speech, and sound effects from text in one pass.
☆44May 27, 2026Updated last month
sil-ai / tts-singlish
View on GitHub
TTS for Singlish using Tacotron2, the IMDA corpus, and Pachyderm.
☆11Jan 11, 2020Updated 6 years ago
bagustris / ssl-ser
View on GitHub
Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"
☆10Mar 15, 2023Updated 3 years ago
RicherMans / Dasheng
View on GitHub
Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"
☆86Nov 7, 2025Updated 8 months ago
ShoukanLabs / VoPho
View on GitHub
A collection of all our phonemeizers for dataset construction and inference
☆30Feb 21, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
kamperh / globalphone_awe
View on GitHub
Multilingual acoustic word embedding approaches applied and evaluated on GlobalPhone data.
☆11Nov 3, 2020Updated 5 years ago
aispeech-lab / w2v-cif-bert
View on GitHub
☆37Jun 28, 2021Updated 5 years ago
igormq / ctcdecode-pytorch
View on GitHub
Python implementation of CTC beam search decoder + agnostic LM scorer
☆20Dec 16, 2020Updated 5 years ago
ArchitParnami / Few-Shot-KWS
View on GitHub
Few-Shot Keyword Spotting
☆73Apr 11, 2021Updated 5 years ago
yuhangear / wenet-android
View on GitHub
☆13Oct 27, 2021Updated 4 years ago
xiaomi-research / acavcaps
View on GitHub
☆31Mar 27, 2026Updated 3 months ago
poleval / 2021-punctuation-restoration
View on GitHub
PolEval 2021 Task 1
☆15Jun 28, 2022Updated 4 years ago