itsnotacie / CVPR2023-EPIC-SOUNDS-Audio-Based-Interaction-Recognition-3rd-place-solution
☆30 Updated last year
Related projects
Alternatives and complementary repositories for CVPR2023-EPIC-SOUNDS-Audio-Based-Interaction-Recognition-3rd-place-solution
- itsnotacie / ICCV2023-OOD-CV-Challenge-Classification-Track-Self-supervised-pretrain-3rd-place-solution ☆29 Updated last year
- [ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation ☆39 Updated last year
- [CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-… ☆34 Updated 3 months ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation" ☆26 Updated last month
- Code for dmrnet ☆16 Updated 3 months ago
- [NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning ☆71 Updated this week
- Official implementation for CIGN ☆14 Updated last year
- Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023]. ☆17 Updated last month
- Multimodal Learning Method MLA for CVPR 2024 ☆56 Updated 4 months ago
- Accepted at ICCV '23 ☆13 Updated last year
- The repo for "Enhancing Multi-modal Cooperation via Sample-level Modality Valuation", CVPR 2024 ☆39 Updated last week
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral]. ☆14 Updated last week
- A Python implementation of Certifiable Robust Multi-modal Training ☆14 Updated 3 months ago
- Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" proposed by Pekin… ☆52 Updated 3 weeks ago
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation ☆40 Updated 3 weeks ago
- Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal … ☆27 Updated last week
- Official PyTorch implementation of Which Tokens to Use? Investigating Token Reduction in Vision Transformers presented at ICCV 2023 NIVT … ☆30 Updated last year
- Official repository for DiffCap: Exploring Continuous Diffusion on Image Captioning ☆7 Updated last year
- Code and Dataset for the paper "LAMM: Label Alignment for Multi-Modal Prompt Learning" AAAI 2024 ☆29 Updated 10 months ago
- The official implementation of "2024NeurIPS Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation" ☆36 Updated 3 weeks ago
- Official implementation of "Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval (CVPR 2024 Highlight)" ☆56 Updated 3 months ago