microsoft/NoAudioCaptioning

Repository for "Training Audio Captioning Models without Audio"

☆10

Alternatives and similar repositories for NoAudioCaptioning

Users that are interested in NoAudioCaptioning are comparing it to the libraries listed below.

soham97 / PAM
PAM is a no-reference audio quality metric for audio generation tasks
☆77 · Jul 19, 2024 · Updated 2 years ago
raymondxyy / strfnet-IS2020
Official repo for the STRFNet system appeared in INTERSPEECH2020
☆12 · Mar 6, 2021 · Updated 5 years ago
microsoft / AudioEntailment
Audio Entailment: Deductive Reasoning for Audio Understanding
☆17 · Dec 10, 2024 · Updated last year
soham97 / ADIFF
Explaining audio differences using language
☆16 · Feb 11, 2025 · Updated last year
frankenliu / LOAE
☆10 · Sep 25, 2024 · Updated last year
snap-research / GenAU
☆53 · Mar 24, 2026 · Updated 3 months ago
audio-captioning / audio-captioning-resources
A list of resources that can help in research for automated audio captioning
☆34 · Feb 17, 2021 · Updated 5 years ago
JozefColdenhoff / OpenACE
☆11 · Aug 1, 2025 · Updated 11 months ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 3 months ago
soham97 / sound_ai_progress
Tracking states of the arts and recent results (bibliography) on sound tasks.
☆33 · Jan 10, 2023 · Updated 3 years ago
sholokhovalexey / online-speaker-clustering
[ICASSP'23] Online speaker clustering
☆18 · Feb 22, 2026 · Updated 4 months ago
apple-yinhan / TQ-SED
☆23 · Mar 19, 2025 · Updated last year
zeyuxie29 / AudioTime
☆39 · Jul 4, 2024 · Updated 2 years ago
minguinho26 / Prefix_AAC_ICASSP2023
Official Implementation of "Prefix tuning for Automated Audio Captioning(ICASSP 2023)"
☆30 · Dec 6, 2023 · Updated 2 years ago
audio-captioning / audio-captioning-papers
A list of papers about audio captioning
☆78 · Jul 1, 2022 · Updated 4 years ago
jaeyeonkim99 / EnCLAP
Official Implementation of EnCLAP (ICASSP 2024)
☆96 · Jun 2, 2024 · Updated 2 years ago
tuanct1997 / Federated-Learning-ASR-based-on-wav2vec-2.0
☆18 · Mar 13, 2024 · Updated 2 years ago
lavendery / AudioComposer
☆27 · Sep 10, 2025 · Updated 10 months ago
wsntxxn / TextToAudioGrounding
The dataset and baseline code for Text-to-Audio Grounding (TAG)
☆49 · Oct 23, 2025 · Updated 8 months ago
soham97 / MTL_Weakly_labelled_audio_data
Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection"
☆17 · Nov 9, 2022 · Updated 3 years ago
interactiveaudiolab / VocalImitationSet
☆18 · Oct 16, 2018 · Updated 7 years ago
ckyang1124 / LALM-Evaluation-Survey
Collection of works for evaluating (and analyzing) large audio-language models (LALMs)
☆41 · Aug 11, 2025 · Updated 11 months ago
soham97 / mellow
small audio language model for reasoning
☆88 · Dec 4, 2025 · Updated 7 months ago
lysanderism / TimeAudio
The official repository TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module…
☆30 · Nov 18, 2025 · Updated 8 months ago
Lhx94As / PHO-LID
PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification
☆21 · Aug 24, 2023 · Updated 2 years ago
Sreyan88 / ReCLAP
☆33 · Dec 23, 2025 · Updated 6 months ago
TigreGotico / chatterbox-onnx
chatterbox TTS + Voice Clone using onnx
☆28 · Updated this week
swagshaw / Rainbow-Keywords
Rainbow Keywords - Official PyTorch Implementation
☆14 · Jun 27, 2024 · Updated 2 years ago
seungheondoh / msu-benchmark
music semantic understanding evaluation benchmark
☆24 · Aug 12, 2023 · Updated 2 years ago
naver / multilingual-distilwhisper
This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper.
☆34 · Apr 22, 2026 · Updated 2 months ago
GasserElbanna / serab-byols
(Hybrid) BYOL-S feature extractor using serab-byols package in pytorch.
☆27 · Apr 20, 2024 · Updated 2 years ago
Ming-er / MGA-CLAP
official implementation of MGA-CLAP (ACM MM 2024)
☆29 · Oct 25, 2024 · Updated last year
skit-ai / Map-Mix
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix(work accepted at I…
☆18 · Feb 17, 2023 · Updated 3 years ago
manoskary / SMUG-Explain
A Framework for Symbolic MUsic Graph Explanations
☆11 · Jul 30, 2025 · Updated 11 months ago
JinhuaLiang / APT
☆20 · Mar 12, 2025 · Updated last year
pbzweihander / markdown-toc
Table of Contents generator for Markdown. Written in Rust.
☆19 · Jan 9, 2024 · Updated 2 years ago
dj-shin / army-client
☆10 · Jan 6, 2021 · Updated 5 years ago
soumimaiti / speechlmscore_tool
☆34 · Nov 24, 2024 · Updated last year
diggerdu / AudioMamba
☆12 · Jun 1, 2024 · Updated 2 years ago