microsoft/AudioEntailment

Audio Entailment: Deductive Reasoning for Audio Understanding

☆17

Alternatives and similar repositories for AudioEntailment

Users who are interested in AudioEntailment are comparing it to the repositories listed below.

Sreyan88 / CompA
Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models
☆23 · Jul 10, 2024 · Updated 2 years ago
soham97 / ADIFF
Explaining audio differences using language
☆16 · Feb 11, 2025 · Updated last year
microsoft / NoAudioCaptioning
Repository for "Training Audio Captioning Models without Audio"
☆10 · Sep 26, 2023 · Updated 2 years ago
HanxunH / AudioMosaic
[ICML2026] AudioMosaic: Contrastive Masked Audio Representation Learning
☆22 · May 15, 2026 · Updated 2 months ago
soham97 / mellow
Small audio language model for reasoning
☆88 · Dec 4, 2025 · Updated 7 months ago
Labbeti / aac-metrics
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
☆75 · Mar 22, 2026 · Updated 4 months ago
kuan2jiu99 / audio-hallucination
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34 · Mar 14, 2025 · Updated last year
ETH-DISCO / sao-instruct
Official repo for SAO-Instruct: Free-form Audio Editing using Natural Language Instructions presented at NeurIPS 2025
☆17 · Oct 28, 2025 · Updated 8 months ago
frankenliu / LOAE
☆10 · Sep 25, 2024 · Updated last year
kyegomez / AudioFlamingo
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…
☆39 · Jan 27, 2025 · Updated last year
snap-research / GenAU
☆53 · Mar 24, 2026 · Updated 3 months ago
etzinis / biased_separation
Code for the paper: Unified Gradient Reweighting for Model Biasing with Applications to Source Separation
☆14 · Nov 16, 2020 · Updated 5 years ago
MLSpeech / FormantsTracker
☆15 · May 26, 2026 · Updated last month
Bai-YT / ConsistencyTTA
ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
☆39 · Nov 20, 2024 · Updated last year
soham97 / PAM
PAM is a no-reference audio quality metric for audio generation tasks
☆77 · Jul 19, 2024 · Updated 2 years ago
Ruiqi-Yan / Awesome-Audio-Editing
A curated list of models, benchmarks, tools and guides for audio editing
☆34 · Jul 7, 2026 · Updated 2 weeks ago
ETH-DISCO / blap
Official repo for BLAP: Bootstrapping Language-Audio Pre-training for Music Captioning presented at ICASSP 2025
☆16 · Nov 18, 2024 · Updated last year
minguinho26 / Prefix_AAC_ICASSP2023
Official Implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)
☆30 · Dec 6, 2023 · Updated 2 years ago
microsoft / Pengi
An Audio Language model for Audio Tasks
☆322 · Apr 19, 2024 · Updated 2 years ago
etzinis / optimal_condition_training
Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…
☆14 · Feb 15, 2023 · Updated 3 years ago
jaeyeonkim99 / EnCLAP
Official Implementation of EnCLAP (ICASSP 2024)
☆96 · Jun 2, 2024 · Updated 2 years ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
☆13 · Jan 5, 2025 · Updated last year
jyhan03 / icassp22-dataset
Dataset simulation for DPCCN.
☆16 · Dec 25, 2022 · Updated 3 years ago
mulab-mir / muchomusic
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
☆46 · Dec 3, 2024 · Updated last year
soham97 / MTL_Weakly_labelled_audio_data
Code repo for "Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection"
☆17 · Nov 9, 2022 · Updated 3 years ago
xiquan-li / FineLAP
[ACL 2026 Main] FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pre-training
☆36 · Apr 20, 2026 · Updated 3 months ago
lucieperrotta / computersandmusic
Notebooks for the EPFL class "Computers and Music".
☆25 · Aug 20, 2021 · Updated 4 years ago
wdqqdw / Echo
Project page of "2026-ICLR Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning"
☆16 · Mar 26, 2026 · Updated 3 months ago
gchrupala / first-steps-ml
First steps in Machine Learning
☆12 · Mar 18, 2015 · Updated 11 years ago
ckyang1124 / LALM-Evaluation-Survey
Collection of works for evaluating (and analyzing) large audio-language models (LALMs)
☆41 · Aug 11, 2025 · Updated 11 months ago
lonzi / mrflow_dpo
☆22 · Jan 3, 2026 · Updated 6 months ago
soham97 / sound_ai_progress
Tracking the state of the art and recent results (bibliography) on sound tasks.
☆33 · Jan 10, 2023 · Updated 3 years ago
WangHelin1997 / SoloAudio
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.
☆119 · Jan 28, 2026 · Updated 5 months ago
zhaoyx239 / X-Translator
☆22 · Updated this week
voidful / llm-codec
LLM-Codec: Neural Audio Codec Meets Language Model Objectives
☆23 · May 3, 2026 · Updated 2 months ago
ASLP-lab / MSU-Bench
Open repository of "MSU-Bench: Towards Understanding the Conversational Multi-Speaker Scenarios"
☆17 · Jul 7, 2026 · Updated 2 weeks ago
XinhaoMei / audio-text_retrieval
Implementation of our paper 'On Metric Learning For Audio-Text Cross-Modal Retrieval'
☆51 · May 17, 2022 · Updated 4 years ago
gwh22 / LAFMA
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation (INTERSPEECH 2024)
☆44 · Jun 13, 2024 · Updated 2 years ago
PapayaResearch / ctag
[ICML'24] Creative Text-to-Audio Generation via Synthesizer Programming
☆41 · Sep 26, 2024 · Updated last year