AlyssaYoung/AVQA

ACM MM 2022 paper "AVQA: A Dataset for Audio-Visual Question Answering on Videos"

☆15

Alternatives and similar repositories for AVQA

Users who are interested in AVQA are comparing it to the repositories listed below.

GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆100 · Dec 30, 2022 · Updated 3 years ago
mira-ai-lab / MUSIC-AVQA-R
☆13 · May 21, 2024 · Updated 2 years ago
GeWu-Lab / PSTP-Net
☆17 · Aug 11, 2023 · Updated 2 years ago
kuan2jiu99 / audio-hallucination
Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024
☆34 · Mar 14, 2025 · Updated last year
yifanfeng97 / OS-MN40-Example
☆15 · Jan 20, 2022 · Updated 4 years ago
kaist-ami / AVHBench
[ICLR'25] Official repository for "AVHBench: A Cross-Modal Hallucination Evaluation for Audio-Visual Large Language Models"
☆25 · Mar 8, 2026 · Updated 4 months ago
bmcfee / ccrma2018_notebooks
Extra notebooks for CCRMA MIR workshop, 2018 edition
☆13 · Jun 28, 2018 · Updated 8 years ago
schowdhury671 / meerkat
☆35 · Jul 9, 2025 · Updated last year
wang22ti / OpenAUC
☆14 · Dec 23, 2023 · Updated 2 years ago
aim-uofa / ReasonMatch
[CVPR2026] Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching
☆19 · Jun 4, 2026 · Updated last month
look4u-ok / video-slicer
☆18 · Jun 18, 2024 · Updated 2 years ago
NJU-LINK / MVU-Eval
MVU-Eval @NeurIPS DB 2025
☆18 · Nov 11, 2025 · Updated 8 months ago
GeWu-Lab / Crab
[CVPR 2025] Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
☆85 · Dec 24, 2025 · Updated 7 months ago
swarupbehera / awesome-audio-visual-question-answering
A curated list of resources in audio visual question answering and related area. :-)
☆17 · Jun 29, 2025 · Updated last year
epic-kitchens / VISOR-VIS
Visualisation of VISOR Segmentations with Annotations and Relations
☆22 · Aug 15, 2022 · Updated 3 years ago
fyyCS / LSLD
☆14 · Nov 13, 2023 · Updated 2 years ago
csycdong / SJDD-Net
☆11 · Jun 25, 2024 · Updated 2 years ago
distributed-information-bottleneck / distributed-information-bottleneck.github.io
A repository for using the distributed information bottleneck to locate information in data
☆17 · Aug 26, 2024 · Updated last year
GeWu-Lab / LFAV
Towards Long Form Audio-visual Video Understanding
☆15 · Jan 16, 2026 · Updated 6 months ago
Vinsonzyh / BlueSuffix
[ICLR 2025] BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
☆31 · Nov 2, 2025 · Updated 8 months ago
zjlww / papers
Connected Papers knockoff, managing academic papers and citations with graph database.
☆12 · Dec 26, 2023 · Updated 2 years ago
cuichenrui2000 / barry_speech_tools
This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets re…
☆13 · Oct 8, 2025 · Updated 9 months ago
HuangZikang-TJU / Aug4TSE
☆15 · Sep 16, 2024 · Updated last year
stoneMo / MGN
Official implementation for MGN
☆20 · Dec 22, 2022 · Updated 3 years ago
facebookresearch / daqa
Temporal Reasoning via Audio Question Answering
☆27 · Dec 21, 2019 · Updated 6 years ago
dingyue772 / OmniSIFT
[ICML2026] OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
☆25 · May 21, 2026 · Updated 2 months ago
vscomputer / chuck-examples
Example code to help people follow along with the tutorials
☆25 · Aug 21, 2024 · Updated last year
aim-uofa / TVRBench
TVRBench: Target Viewpoint Reproduction Benchmark for Active Spatial Intelligence
☆25 · Jun 2, 2026 · Updated last month
xiaoxue1117 / speech-mamba-public
☆15 · Nov 26, 2024 · Updated last year
adxcreative / D-M
The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…
☆10 · Feb 9, 2025 · Updated last year
OpenGVLab / MUTR
「AAAI 2024」 Referred by Multi-Modality: A Unified Temporal Transformers for Video Object Segmentation
☆85 · Jun 13, 2025 · Updated last year
aim-uofa / GSI-Bench
[CVPR2026] Exploring Spatial Intelligence from a Generative Perspective
☆30 · Jun 3, 2026 · Updated last month
Franklin905 / VALOR
Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser"
☆17 · Jul 13, 2025 · Updated last year
hongsunjang / pipe-bd
[DATE 2023] Pipe-BD: Pipelined Parallel Blockwise Distillation
☆12 · Jul 13, 2023 · Updated 3 years ago
Audio-Reasoning-Challenge / Audio-Reasoning-Challenge-Baselines
The baselines of ARC-Challenge-Interspeech2026
☆60 · Dec 1, 2025 · Updated 7 months ago
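AnuoF / asr_example_csharp
A library that wraps the Baidu, SinoVoice (捷通华声), and iFLYTEK speech recognition services, along with wrappers for the SinoVoice, ethnic-minority language (民族语文), and NiuTrans translation services.
☆15 · Sep 10, 2019 · Updated 6 years ago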
khfs / DuplexMamba
☆18 · Mar 6, 2026 · Updated 4 months ago
tuyunbin / SRDRL
[ACL 2021] This is the Pytorch code for our paper "Semantic Relation-aware Difference Representation Learning for Change Captioning".
☆13 · Jan 16, 2022 · Updated 4 years ago
beasteers / singuconda
go binary for setting up singularity containers with a miniconda
☆21 · Feb 3, 2026 · Updated 5 months ago