Sreyan88/CompA

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Sreyan88/CompA)

Sreyan88 / CompA

Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models

☆23

Alternatives and similar repositories for CompA

Users that are interested in CompA are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

microsoft / AudioEntailment
View on GitHub
Audio Entailment: Deductive Reasoning for Audio Understanding
☆17Dec 10, 2024Updated last year
Sreyan88 / ReCLAP
View on GitHub
☆33Dec 23, 2025Updated 6 months ago
mulab-mir / muchomusic
View on GitHub
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
☆46Dec 3, 2024Updated last year
Sreyan88 / RECAP
View on GitHub
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16Jun 23, 2024Updated 2 years ago
kyegomez / AudioFlamingo
View on GitHub
Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial…
☆39Jan 27, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
gzhu06 / Cacophony
View on GitHub
Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986
☆49Jan 19, 2026Updated 6 months ago
aim-qmul / sdx23-aimless
View on GitHub
Source Separation training codebase for the Sound Demixing Challenge 2023.
☆45May 18, 2023Updated 3 years ago
TiffanyBlews / MozartsTouch
View on GitHub
Official implementation of Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models
☆43Mar 17, 2026Updated 4 months ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
View on GitHub
☆13Mar 23, 2026Updated 3 months ago
zhaojw1998 / DAT-CVAE
View on GitHub
Codes and MIDI demos of ISMIR 2022 paper: Domain Adversarial Training on Conditional Variational Auto-Encoder for Controllable Music Gene…
☆21Mar 28, 2023Updated 3 years ago
Labbeti / aac-metrics
View on GitHub
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
☆75Mar 22, 2026Updated 3 months ago
kwatcharasupat / musdb25
View on GitHub
MUSDB25 - A Fully Multitrack Dataset for Music Source Separation
☆13Mar 29, 2025Updated last year
nttcslab / m2d
View on GitHub
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
☆161Feb 23, 2026Updated 4 months ago
Audio-AGI / FlowSep
View on GitHub
Official implementation for FlowSep
☆77Jan 2, 2025Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
sony / soundctm
View on GitHub
Pytorch implementation of SoundCTM
☆101Mar 31, 2025Updated last year
yongyizang / AreYouReallyListening
View on GitHub
Official Repository for ISMIR 2025 paper "Are you really listening? Boosting Perceptual Awareness in Music-QA Benchmarks"
☆20Aug 18, 2025Updated 11 months ago
slSeanWU / beats-conformer-bart-audio-captioner
View on GitHub
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41Jan 6, 2024Updated 2 years ago
HanxunH / AudioMosaic
View on GitHub
[ICML2026] AudioMosaic: Contrastive Masked Audio Representation Learning
☆22May 15, 2026Updated 2 months ago
kuan2jiu99 / audio-hallucination
View on GitHub
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34Mar 14, 2025Updated last year
mcomunita / tonetwist-afx-dataset
View on GitHub
Dataset of dry/wet pairs for audio effects research
☆43Apr 17, 2025Updated last year
CPJKU / audio_sheet_retrieval
View on GitHub
Learning Audio–Sheet Music Correspondences for Cross-Modal Retrieval and Piece Identification
☆25Jun 21, 2022Updated 4 years ago
kyegomez / Audio-xLSTMs
View on GitHub
Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch
☆19Jul 13, 2026Updated last week
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆12Mar 14, 2025Updated last year
Virtual machines for every use case on DigitalOcean • Ad
Get dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
chaufanglin / Normal2Whisper
View on GitHub
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14Oct 31, 2024Updated last year
Hannieliao / Baton
View on GitHub
Official Repository of IJCAI 2024 Paper: "BATON: Aligning Text-to-Audio Model with Human Preference Feedback"
☆32Mar 4, 2025Updated last year
yongyizang / BachDuet-WebGUI
View on GitHub
A Web Application for Baroque-style Human/Computer Musical Jamming.
☆15May 31, 2023Updated 3 years ago
qiuqiangkong / music_llm
View on GitHub
☆56Jul 13, 2025Updated last year
Sreyan88 / GAMA
View on GitHub
Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities
☆153Dec 5, 2024Updated last year
SonyCSLParis / audio-representations
View on GitHub
JEPAs for audio representation learning
☆26Jun 11, 2026Updated last month
yoyolicoris / music-spectrogram-diffusion-pytorch
View on GitHub
☆88Jan 29, 2023Updated 3 years ago
yongyizang / OpenGufeng
View on GitHub
An Open-source Gufeng Melody and Chord Dataset
☆15May 10, 2023Updated 3 years ago
sildater / thegluenote
View on GitHub
TheGlueNote is representation model for note-wise music alignment.
☆14Jul 19, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Audio-AGI / dcase2024_task9_baseline
View on GitHub
Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26Mar 27, 2024Updated 2 years ago
GLJS / audio-datasets
View on GitHub
GitHub Repository for the Survey Paper on Audio-Language Datasets for Scenes and Events
☆17Feb 7, 2025Updated last year
Splend1d / T5lephone
View on GitHub
Code for T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5
☆19Nov 29, 2022Updated 3 years ago
HilaManor / AudioEditingCode
View on GitHub
☆194Nov 19, 2025Updated 8 months ago
Kikyo-16 / airgen
View on GitHub
Official source codes of airsep
☆39Mar 26, 2024Updated 2 years ago
carey-bunks / Jazz-Chord-Progressions-Corpus
View on GitHub
Jazz chord progression corpus and code for evaluating harmonic similarity
☆18Oct 20, 2023Updated 2 years ago
salgado / music-search
View on GitHub
Code from blog 'Searching by Music: Leveraging Vector Search for Music Information Retrieval'
☆16Nov 16, 2023Updated 2 years ago