zailongchen/Audio-Visual-Question-Answering-AVQA

This project is based on the MUSIC-AVQA dataset. We focus on improving accuracy on the AVQA task, which aims to answer questions about different visual objects, sounds, and their associations in videos. The problem requires comprehensive multimodal understanding and spatio-temporal reasoning over audio-visual scenes.

☆13

Alternatives and similar repositories for Audio-Visual-Question-Answering-AVQA

Users who are interested in Audio-Visual-Question-Answering-AVQA are comparing it to the libraries listed below.

zailongchen / R2GenAlign
Analyzing and Enhancing Visual Learning in LLM-based Radiology Report Generation
☆17 · Feb 23, 2026 · Updated 4 months ago
zailongchen / MultiP-R2Gen
Enhancing Radiology Report Generation via Multi-Phased Supervision
☆25 · Mar 6, 2025 · Updated last year
zailongchen / R2Gen-EVA
Optimizing Efficiency and Visual-Textual Alignment for LLM-Based Radiology Report Generation
☆19 · Mar 5, 2025 · Updated last year
cc200041 / Laco
Official implementation of "LaCo: Layer-wise Compensation for Pruned Large Language Models" (ACL 2026).
☆152 · May 20, 2026 · Updated last month
FishMaster93 / AFFIA3K
☆10 · Apr 12, 2023 · Updated 3 years ago
huanyushi / huanyushi.github.io
My blog based on the Jekyll theme Chirpy
☆20 · Jun 30, 2026 · Updated last week
JC-Shi / Learned-Index-Benefits
☆21 · Mar 2, 2022 · Updated 4 years ago
cornchz / Bron-Kerbosch
Performance comparison of three Bron–Kerbosch algorithm implementations that find all maximal cliques in a graph.
☆25 · May 12, 2014 · Updated 12 years ago
uwdb / VOCAL-UDF
VOCAL-UDF: Self-Enhancing Video Data Management System for Compositional Events with Large Language Models
☆12 · Dec 12, 2025 · Updated 6 months ago
kaletap / bfs-cuda-gpu
Implementation of a parallel breadth-first search algorithm for graph traversal using CUDA and C++.
☆35 · Dec 12, 2019 · Updated 6 years ago
mongodb-developer / Google-Cloud-Generative-AI-Chatbot
Generative AI Customer Service Chatbot with MongoDB Atlas and Google Cloud Vertex AI PaLM API
☆16 · Dec 11, 2023 · Updated 2 years ago
murnanedaniel / Dynamic-Loss-Weighting
A small collection of tools to manage deep learning with multiple sources of loss
☆18 · May 6, 2025 · Updated last year
liangdaojun / Denseformer
Intrusion Detection System, IDS, Cyberattack Detection, PyTorch, Transformer
☆11 · Oct 17, 2022 · Updated 3 years ago
andijakl / nfcinteractor
View low level information about NFC tags and their contents, and write your own tags with a dynamic NDEF message editor UI. Qt version f…
☆22 · Jul 22, 2013 · Updated 12 years ago
hqsiswiliam / persona-adaptive-attention
☆26 · Oct 13, 2023 · Updated 2 years ago
archit-p / NLP-Malware
Network-Based Malware Detection using Natural Language Processing
☆14 · May 10, 2021 · Updated 5 years ago
DataManagementLab / zero-shot-cost-estimation
Implementation of our VLDB'22 paper "Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction"
☆55 · Nov 11, 2022 · Updated 3 years ago
downgoon / video-motion-detection
Introductory video AI tutorial: video motion detection
☆17 · Oct 13, 2020 · Updated 5 years ago
Yasir-ali-farrukh / GNN4ID
GNN4ID: A Toolset for Crafting Graph Neural Network-Based NIDS Datasets
☆32 · Feb 23, 2026 · Updated 4 months ago
Nidhi08 / GANs-for-imbalanced-data-generation
☆17 · Oct 30, 2018 · Updated 7 years ago
etzinis / heterogeneous_separation
Code and data recipes for the paper: Heterogeneous Target Speech Separation
☆44 · Dec 6, 2022 · Updated 3 years ago
sandialabs / packet2vec
Word2Vec embeddings over packet capture data n-grams.
☆20 · Mar 24, 2023 · Updated 3 years ago
PoCInnovation / Sharkticon
Sharkticon is an anomaly detection system that analyzes your network using a Transformer model adapted for anomaly detection.
☆23 · May 19, 2023 · Updated 3 years ago
microsoft / WavText5K
Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"
☆50 · Nov 10, 2022 · Updated 3 years ago
haoheliu / diffres-python
Learning differentiable temporal resolution on time-series data.
☆36 · Nov 12, 2022 · Updated 3 years ago
kyuyeonpooh / objects-that-sound
An unofficial implementation of the paper "Objects that Sound" from ECCV 2018.
☆31 · Jan 29, 2024 · Updated 2 years ago
c2dc / fl-unsup-nids
☆23 · Oct 22, 2024 · Updated last year
virtuald / r-star-tree
A relatively simple implementation of the R* Tree data structure for C++
☆51 · Jan 10, 2023 · Updated 3 years ago
rafalk342 / bfs-cuda
Implementation of breadth first search on GPU with CUDA Driver API.
☆55 · Apr 7, 2021 · Updated 5 years ago
qibinlou / FacePlusPlus-Stars-Library-Images-Crawler
Crawler for the annotated celebrity headshot set of the Face++ starlib (celebrity library), plus the image collection, for face recognition training
☆25 · Sep 29, 2018 · Updated 7 years ago
wangkai-tech23 / LiPar
LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection (arXiv:2311.08000v2)
☆26 · Nov 22, 2025 · Updated 7 months ago
redBu1l / ZVulDrill
Secondary development of the ZVulDrill practice range, adding some common PHP vulnerabilities; continuously updated.
☆31 · Jun 9, 2017 · Updated 9 years ago
nfc-tools / libndef
Qt library to encode/decode NDEF (NFC Data Exchange Format) messages
☆32 · Sep 28, 2020 · Updated 5 years ago
yangyangxu0 / MQTransformer
☆25 · Jun 29, 2023 · Updated 3 years ago
greatji / Learning-based-cost-estimator
☆62 · Jun 17, 2021 · Updated 5 years ago
vvsotnikov / LSTM-IDS
Network data classifier based on the recurrent neural network.
☆20 · Apr 3, 2019 · Updated 7 years ago
mandersch / RTIDS
Implementation of Robust Transformer Based Intrusion Detection, based on the paper by Wu et al.
☆29 · Sep 10, 2024 · Updated last year
pandazheng / MiraiSecurity
Mirai
☆42 · Oct 19, 2021 · Updated 4 years ago
jogecodes / transformerAD
Code for the paper "Anomaly-Based Intrusion Detection in IIoT Networks Using Transformer Models"
☆38 · Mar 3, 2023 · Updated 3 years ago