xid32 / SoundMind

We introduce the Audio Logical Reasoning (ALR) dataset, consisting of 6,446 text-audio annotated samples specifically designed for complex reasoning tasks. Building on this resource, we propose SoundMind, a rule-based reinforcement learning (RL) algorithm tailored to endow audio language models (ALMs) with deep bimodal reasoning abilities.
126 · Updated this week

Alternatives and similar repositories for SoundMind

Users who are interested in SoundMind compare it with the repositories listed below.
