schowdhury671 / meerkat
☆18 · Updated last month
Related projects
Alternatives and complementary repositories for meerkat
- [ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario… ☆39 · Updated 2 months ago
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023) ☆58 · Updated 9 months ago
- Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024. ☆14 · Updated 2 weeks ago
- Research code for NeurIPS 2023 paper "Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser" ☆15 · Updated last year
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation" ☆26 · Updated last month
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition" ☆37 · Updated this week
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-… ☆34 · Updated 3 months ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆48 · Updated last year
- Vision Transformers are Parameter-Efficient Audio-Visual Learners ☆85 · Updated last year
- NeurIPS'2023 official implementation code ☆56 · Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆40 · Updated 4 months ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning" ☆26 · Updated 9 months ago
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆44 · Updated 4 months ago
- MUSIC-AVQA, CVPR2022 (ORAL) ☆67 · Updated last year
- Official implementation of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval (CVPR 2024 Highlight)" ☆56 · Updated 3 months ago
- Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports ☆28 · Updated 10 months ago
- [CVPR'23 Highlight] AutoAD: Movie Description in Context. ☆87 · Updated this week
- Official Implementation of SnAG (CVPR 2024) ☆35 · Updated 2 weeks ago
- Official Implementation of "The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval" ☆46 · Updated last week
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr… ☆116 · Updated 2 months ago
- [Preprint] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆65 · Updated last month
- Unified Audio-Visual Perception for Multi-Task Video Localization ☆21 · Updated 6 months ago
- Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024 ☆15 · Updated 7 months ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆61 · Updated 5 months ago
- This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and … ☆34 · Updated last year
- [CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection ☆73 · Updated 3 months ago