zailongchen / Audio-Visual-Question-Answering-AVQA
This task is based on the MUSIC-AVQA dataset. We focus on optimizing the accuracy of the AVQA task, which aims to answer questions about different visual objects, sounds, and their associations in videos. The problem requires comprehensive multimodal understanding and spatio-temporal reasoning over audio-visual scenes.
☆12Updated 2 years ago
Alternatives and similar repositories for Audio-Visual-Question-Answering-AVQA
Users that are interested in Audio-Visual-Question-Answering-AVQA are comparing it to the libraries listed below
- Bidirectional Visual-Textual Alignment for LLM-based Radiology Report Generation☆16Updated 5 months ago
- Enhancing Radiology Report Generation via Multi-Phased Supervision☆17Updated 5 months ago
- Optimizing Efficiency and Visual-Textual Alignment for LLM-Based Radiology Report Generation☆18Updated 5 months ago
- ☆62Updated 6 months ago
- Efficient Network Traffic Classification via Pre-training Unidirectional Mamba☆131Updated 4 months ago
- Research progress on speech deepfake detection: Relevant datasets aggregated from the review literature and publicly available codes☆226Updated 2 months ago
- GNN4ID: A Toolset for Crafting Graph Neural Network-Based NIDS Datasets☆17Updated 4 months ago
- Implementation of Robust Transformer Based Intrusion Detection, based on the paper by Wu et al.☆22Updated 11 months ago
- ASVspoof 2021 Baseline Systems☆232Updated last year
- [ICANN 2023] Anomaly-Based Insider Threat Detection via Hierarchical Information Fusion☆18Updated last year
- An LLM Approach for Open-Set Encrypted Traffic Classification☆35Updated last month
- This is the repository of the paper NetDiffus: Network Traffic Generation by Diffusion Models through Time-Series Imaging.☆21Updated 10 months ago
- Generate network packets using generative modeling☆13Updated 2 years ago
- A research project of anomaly detection on dataset IoT-23☆99Updated 11 months ago
- The repository of ET-BERT, a network traffic classification model on encrypted traffic. The work has been accepted as The Web Conference …☆507Updated 4 months ago
- Reproduction of paper Void: A Fast and Light Voice Liveness Detection System☆19Updated 4 years ago
- ☆86Updated last year
- This is a Python version of CICFlowmeter-V4.0 (formerly known as ISCXFlowMeter) - an Ethernet traffic Bi-flow generator and analyzer for …☆74Updated 4 years ago
- DeepFake Audio detection using MFCC☆32Updated last year
- Analysis of the ISCX VPN-nonVPN Dataset 2016 for Encrypted Network Traffic Classification☆89Updated last year
- A curation of awesome papers, datasets and tools about network traffic analysis.☆80Updated 9 months ago
- LiPar: A Lightweight Parallel Learning Model for Practical In-Vehicle Network Intrusion Detection (arXiv:2311.08000v2)☆17Updated 6 months ago
- ☆61Updated 8 months ago
- ☆15Updated 2 years ago
- Materials about Encrypted Traffic Analysis☆197Updated 2 weeks ago
- Code for the AAAI'23 paper "Yet Another Traffic Classifier: A Masked Autoencoder Based Traffic Transformer with Multi-Level Flow Represen…☆107Updated last year
- Toolkit for processing PCAP files and transforming them into MNIST-format images☆242Updated last year
- Traffic dataset USTC-TFC2016☆145Updated 6 years ago
- ☆19Updated 4 years ago
- An automatic packet crafting tool for evading learning-based NIDS☆82Updated 3 years ago