[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆13 · Jun 16, 2024 · Updated last year
Alternatives and similar repositories for ego-AV-spatial-correspondence
Users that are interested in ego-AV-spatial-correspondence are comparing it to the libraries listed below.
- ☆21 · Feb 15, 2022 · Updated 4 years ago
- Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence ☆20 · Jun 14, 2024 · Updated last year
- Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023) ☆12 · Jun 1, 2023 · Updated 2 years ago
- ☆33 · Apr 10, 2023 · Updated 2 years ago
- ☆19 · Jul 22, 2025 · Updated 8 months ago
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation ☆24 · Aug 12, 2022 · Updated 3 years ago
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset ☆81 · Aug 26, 2025 · Updated 7 months ago
- ☆19 · May 19, 2024 · Updated last year
- code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection ☆52 · May 1, 2023 · Updated 2 years ago
- EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset ☆60 · Nov 23, 2020 · Updated 5 years ago
- Resnet-50 + FPN + Keypoint RCNN ☆14 · Jun 18, 2019 · Updated 6 years ago
- Audio-Visual Room Impulse Response Estimation ☆24 · Jul 22, 2024 · Updated last year
- Official implementation for CIGN ☆17 · Sep 11, 2023 · Updated 2 years ago
- Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral]. ☆35 · Nov 2, 2024 · Updated last year
- Panoramic Out-of-Distribution Segmentation ☆15 · Dec 21, 2025 · Updated 3 months ago
- Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound ☆139 · Mar 28, 2025 · Updated last year
- Compress conventional Vision-Language Pre-training data ☆53 · Sep 22, 2023 · Updated 2 years ago
- ☆36 · Jul 9, 2025 · Updated 8 months ago
- [ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation ☆31 · Dec 4, 2024 · Updated last year
- PyTorch Implementation of ViT-TTS (EMNLP'23) ☆11 · Oct 20, 2023 · Updated 2 years ago
- Official Implementation for "ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation", CVPR 2024. ☆10 · Jun 17, 2024 · Updated last year
- Code for Linguistic Structure Guided Context Modeling for Referring Image Segmentation, ECCV2020. ☆16 · Oct 2, 2020 · Updated 5 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization ☆27 · Jan 6, 2024 · Updated 2 years ago
- ☆22 · Mar 20, 2024 · Updated 2 years ago
- Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation ☆18 · Feb 25, 2025 · Updated last year
- ☆13 · Nov 28, 2021 · Updated 4 years ago
- Human-centric environment representations from egocentric video ☆14 · Feb 5, 2026 · Updated last month
- This repository contains code for AAAI2025 paper "Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal … ☆23 · Aug 18, 2025 · Updated 7 months ago
- Official implementation of CVPR2025 paper: "T2ICount: Enhancing Cross-modal Understanding for zero-shot Counting" ☆25 · Apr 9, 2025 · Updated 11 months ago
- Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models" ☆19 · Feb 1, 2026 · Updated last month
- Code implementation for our ECCV, 2022 paper titled "My View is the Best View: Procedure Learning from Egocentric Videos" ☆34 · Feb 5, 2024 · Updated 2 years ago
- ☆35 · Jun 6, 2023 · Updated 2 years ago
- [NeurIPS 2024 Spotlight] code for "Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement" ☆19 · Jan 26, 2025 · Updated last year
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms ☆18 · Oct 8, 2023 · Updated 2 years ago
- Tutorial covering Open Source tools for Source Separation. ☆15 · Nov 12, 2021 · Updated 4 years ago
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente… ☆134 · Dec 4, 2023 · Updated 2 years ago
- [ICLR2023] Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation (CDCD). ☆162 · Apr 5, 2023 · Updated 2 years ago
- [ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation". ☆13 · Feb 24, 2025 · Updated last year
- Unified Audio-Visual Perception for Multi-Task Video Localization ☆31 · Apr 19, 2024 · Updated last year