zailongchen / Audio-Visual-Question-Answering-AVQA
This task is based on the MUSIC-AVQA dataset. We focus on optimizing accuracy on the AVQA task, which aims to answer questions about the visual objects, the sounds, and their associations in videos. The problem requires comprehensive multimodal understanding and spatio-temporal reasoning over audio-visual scenes.
☆13Feb 11, 2023Updated 3 years ago
Alternatives and similar repositories for Audio-Visual-Question-Answering-AVQA
Users who are interested in Audio-Visual-Question-Answering-AVQA are comparing it to the repositories listed below
- Bidirectional Visual-Textual Alignment for LLM-based Radiology Report Generation☆17Mar 5, 2025Updated 11 months ago
- Enhancing Radiology Report Generation via Multi-Phased Supervision☆24Mar 6, 2025Updated 11 months ago
- Optimizing Efficiency and Visual-Textual Alignment for LLM-Based Radiology Report Generation☆19Mar 5, 2025Updated 11 months ago
- ☆10Apr 12, 2023Updated 2 years ago
- VOCAL-UDF: Self-Enhancing Video Data Management System for Compositional Events with Large Language Models☆12Dec 12, 2025Updated 2 months ago
- Intrusion Detection System, IDS, Cyberattack Detection, PyTorch, Transformer☆11Oct 17, 2022Updated 3 years ago
- My blog based on the Jekyll theme Chirpy☆16May 21, 2025Updated 8 months ago
- Generative AI Customer Service Chatbot with MongoDB Atlas and Google Cloud Vertex AI PaLM API☆16Dec 11, 2023Updated 2 years ago
- Network-Based Malware Detection using Natural Language Processing☆14May 10, 2021Updated 4 years ago
- A small collection of tools to manage deep learning with multiple sources of loss☆17May 6, 2025Updated 9 months ago
- GNN4ID: A Toolset for Crafting Graph Neural Network-Based NIDS Datasets☆27Apr 4, 2025Updated 10 months ago
- Video AI introductory tutorial: video motion detection☆17Oct 13, 2020Updated 5 years ago
- ☆21Mar 2, 2022Updated 3 years ago
- LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection (arXiv:2311.08000v2)☆24Nov 22, 2025Updated 2 months ago
- Sharkticon is an anomaly detection system that analyzes your network using a Transformer model adapted to anomaly detection.☆23May 19, 2023Updated 2 years ago
- ☆17Oct 30, 2018Updated 7 years ago
- View low level information about NFC tags and their contents, and write your own tags with a dynamic NDEF message editor UI. Qt version f…☆22Jul 22, 2013Updated 12 years ago
- ☆22Oct 22, 2024Updated last year
- Word2Vec embeddings over packet capture data n-grams.☆20Mar 24, 2023Updated 2 years ago
- Implementation of Robust Transformer Based Intrusion Detection, based on the paper by Wu et al.☆27Sep 10, 2024Updated last year
- Network data classifier based on the recurrent neural network.☆20Apr 3, 2019Updated 6 years ago
- Crawler and image collection for the Face++ starlib celebrity face annotation set, for face recognition training☆25Sep 29, 2018Updated 7 years ago
- ☆25Jun 29, 2023Updated 2 years ago
- ☆26Oct 13, 2023Updated 2 years ago
- Qt library to encode/decode NDEF (NFC Data Exchange Format) messages☆32Sep 28, 2020Updated 5 years ago
- Performance comparison of three Bron–Kerbosch algorithm implementations that find all maximal cliques in a graph.☆25May 12, 2014Updated 11 years ago
- Secondary development of the ZVulDrill vulnerability practice range, adding some common PHP vulnerabilities; continuously updated.☆32Jun 9, 2017Updated 8 years ago
- The unofficial implementation of paper, "Objects that Sound", from ECCV 2018.☆31Jan 29, 2024Updated 2 years ago
- Code for the paper "Anomaly-Based Intrusion Detection in IIoT Networks Using Transformer Models"☆35Mar 3, 2023Updated 2 years ago
- This program allows you to extract some features from pcap files.☆40Apr 4, 2023Updated 2 years ago
- Implementation of parallel Breadth First Algorithm for graph traversal using CUDA and C++ language.☆34Dec 12, 2019Updated 6 years ago
- Code and data recipes for the paper: Heterogeneous Target Speech Separation☆43Dec 6, 2022Updated 3 years ago
- Learning differentiable temporal resolution on time-series data.☆36Nov 12, 2022Updated 3 years ago
- Payload-Byte is a tool for extracting and labeling packet capture (Pcap) files of modern network intrusion detection datasets.☆48Jul 12, 2024Updated last year
- Web-crawl for "Audio Retrieval with WavText5K and CLAP Training"☆50Nov 10, 2022Updated 3 years ago
- Implementation of our VLDB'22 paper "Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction"☆54Nov 11, 2022Updated 3 years ago
- Deepfake detection with deep learning.☆42Apr 24, 2023Updated 2 years ago
- KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques☆51Dec 9, 2024Updated last year
- Mirai☆42Oct 19, 2021Updated 4 years ago