This repository contains the code for the AAAI 2025 paper "Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration".
☆23 · Aug 18, 2025 · Updated 7 months ago
Alternatives and similar repositories for CCNet-AAAI2025
Users that are interested in CCNet-AAAI2025 are comparing it to the libraries listed below.
- [2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization · ☆42 · Mar 7, 2025 · Updated last year
- [2024 ECCV] Label-anticipated Event Disentanglement for Audio-Visual Video Parsing · ☆14 · Nov 17, 2024 · Updated last year
- ☆13 · Feb 26, 2024 · Updated 2 years ago
- ☆36 · Jul 9, 2025 · Updated 8 months ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation · ☆19 · Feb 14, 2025 · Updated last year
- Towards Long Form Audio-visual Video Understanding · ☆15 · Jan 16, 2026 · Updated 2 months ago
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023) · ☆72 · Jan 4, 2026 · Updated 2 months ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24] · ☆10 · Jul 22, 2024 · Updated last year
- [CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation". · ☆45 · Jun 5, 2025 · Updated 9 months ago
- [CVPR 2025] Official implementation of the paper "DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for T… · ☆25 · Jul 1, 2025 · Updated 8 months ago
- [EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering · ☆18 · Oct 9, 2024 · Updated last year
- MUSIC-AVQA, CVPR2022 (ORAL) · ☆98 · Dec 30, 2022 · Updated 3 years ago
- [CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos' · ☆13 · Jun 16, 2024 · Updated last year
- [EMNLP 2024] A Video Chat Agent with Temporal Prior