haoheliu/ontology-aware-audio-tagging

☆14

Alternatives and similar repositories for ontology-aware-audio-tagging

Users who are interested in ontology-aware-audio-tagging are comparing it to the repositories listed below.

Koziev / StressModel
Neural model for prediction of stress position in Russian words
☆13 · Jun 22, 2025 · Updated last year
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 · Nov 20, 2023 · Updated 2 years ago
torchopenl3 / torchopenl3
☆20 · Aug 26, 2022 · Updated 3 years ago
xavierfav / coala
COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations
☆48 · Jul 25, 2024 · Updated last year
RicherMans / PSL
Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"
☆31 · Apr 29, 2022 · Updated 4 years ago
Yuanbo2020 / Audio-Visual-VAD
☆13 · May 9, 2022 · Updated 4 years ago
roudimit / c2kd
Code for the C2KD paper (ICASSP 2023)
☆19 · May 15, 2023 · Updated 3 years ago
Kowalski1024 / Mi-Go
Mi-Go is an open-source test framework designed to evaluate and compare the accuracy of speech-to-text models on a YouTube dataset.
☆12 · Jul 2, 2024 · Updated 2 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆16 · Dec 3, 2024 · Updated last year
michaelneri / unsupervised-audio-anomaly-detection
Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …
☆11 · Nov 6, 2024 · Updated last year
google-deepmind / slowfast_nfnets
☆30 · Jun 22, 2022 · Updated 4 years ago
TTS-Research / PEL-TTS
☆14 · Aug 16, 2023 · Updated 2 years ago
ethman / tagbox
Steer OpenAI's Jukebox with Music Taggers
☆42 · Apr 21, 2022 · Updated 4 years ago
suralmasha / RuTranscript
Russian phonetic transcription
☆11 · May 20, 2026 · Updated last month
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆32 · Apr 8, 2022 · Updated 4 years ago
gzhu06 / Cacophony
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49 · Jan 19, 2026 · Updated 5 months ago
backspacetg / distilXLSR
Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model
☆13 · Mar 30, 2025 · Updated last year
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Oct 14, 2024 · Updated last year
tmurakam / felica2money
Converts electronic money statements to OFX format using a PaSoRi reader
☆16 · Dec 25, 2021 · Updated 4 years ago
hearbenchmark / hear-eval-kit
Evaluation kit for the HEAR Benchmark
☆65 · Feb 12, 2026 · Updated 4 months ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 · Jun 2, 2023 · Updated 3 years ago
seungheondoh / hi_kia
wake-up word emotion recognition [APSIPA 2022]
☆17 · Nov 11, 2022 · Updated 3 years ago
audiolabs / anechoic-noise
Generator for anechoic, non-stationary noise signals
☆11 · Aug 12, 2022 · Updated 3 years ago
SSYSteve / GRATIS
☆16 · Sep 7, 2024 · Updated last year
Veleslavia / conditioned-u-net
Conditioned U-Net for Music Source Separation
☆20 · May 15, 2021 · Updated 5 years ago
zelaki / DisfluentFA
A Weakly Supervised Forced Alignment for disfluent speech
☆15 · Nov 12, 2023 · Updated 2 years ago
lucidrains / CLAP
Contrastive Language-Audio Pretraining
☆15 · May 18, 2021 · Updated 5 years ago
hearbenchmark / hear2021-submitted-models
Open-source audio embedding models, submitted to the HEAR 2021 challenge
☆11 · Feb 15, 2026 · Updated 4 months ago
genisplaja / diffusion-vocal-sep
Code for "A diffusion-inspired training strategy for singing voice extraction in the waveform domain" (ISMIR 2022)
☆17 · Feb 16, 2023 · Updated 3 years ago
AsoSoft / AsoSoft-TTS-Speech-Corpus-for-Central-Kurdish
AsoSoft Speech Corpus for Central-Kurdish Text-To-Speech
☆23 · Jun 24, 2022 · Updated 4 years ago
zhaojw1998 / Query-and-reArrange
Code and demo for paper: Zhao et al., "Q&A: Query-Based Representation Learning for Multi-Track Symbolic Music re-Arrangement," IJCAI 202…
☆21 · May 2, 2024 · Updated 2 years ago
mozilla / murmur
DEPRECATED - A webapp for collecting speech samples for voice recognition testing and training
☆20 · May 23, 2019 · Updated 7 years ago
RicherMans / SAT
Streaming Audiotransformers for online Audio tagging
☆57 · Jun 14, 2024 · Updated 2 years ago
wangyu / rethink-audio-fsl
Who calls the shots? Rethinking Few-Shot Learning for Audio (WASPAA 2021)
☆43 · May 24, 2022 · Updated 4 years ago
R1ckShi / FrontEnd-AEC
Acoustic echo cancellation (AEC) is a main algorithm in the pipeline of acoustic devices with KWS or ASR. FNLMS is used.
☆19 · Apr 22, 2019 · Updated 7 years ago
thomeou / SALSA-Lite
This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.
☆15 · Dec 3, 2021 · Updated 4 years ago
MGitHubL / TMac
☆14 · Feb 26, 2024 · Updated 2 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 · Dec 12, 2024 · Updated last year
hvy / chainer-faster-rcnn
☆10 · Apr 22, 2016 · Updated 10 years ago