SAGNIKMJR/ego-AV-spatial-correspondence

SAGNIKMJR / ego-AV-spatial-correspondence

[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'

☆14

Alternatives and similar repositories for ego-AV-spatial-correspondence

Users who are interested in ego-AV-spatial-correspondence are comparing it to the repositories listed below.

zcxu-eric / Ego4d_TalkNet_ASD
☆21 · Feb 15, 2022 · Updated 4 years ago
Ego4DSounds / Ego4DSounds
Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence
☆21 · Jun 14, 2024 · Updated 2 years ago
jinbae-s / ACVIS
[ICASSP 2026] The official pytorch implementation of ACVIS
☆15 · Jan 19, 2026 · Updated 6 months ago
ruohaoguo / ovavss
Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].
☆37 · Nov 2, 2024 · Updated last year
kaiw7 / STG-CMA
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
☆15 · Apr 13, 2024 · Updated 2 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
AV-Reasoner / AV-Reasoner
☆19 · Jul 22, 2025 · Updated last year
ruohaoguo / avis
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆52 · Jun 5, 2025 · Updated last year
jasongief / Mettle
[2025 TPAMI] Mettle: Meta-Token Learning for Memory-Efficient Audio-Visual Adaptation
☆18 · Jan 3, 2026 · Updated 6 months ago
dzh19990407 / LBDT
CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
☆24 · Aug 12, 2022 · Updated 3 years ago
EGO4D / ego-exo4d-egopose
☆18 · Apr 16, 2024 · Updated 2 years ago
BriansIDP / AudioVisualLLM
☆19 · May 19, 2024 · Updated 2 years ago
facebookresearch / EgoCom-Dataset
EgoCom: A Multi-person Multi-modal Egocentric Communications Dataset
☆63 · Nov 23, 2020 · Updated 5 years ago
OpenGVLab / EgoExoLearn
[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset
☆86 · Aug 26, 2025 · Updated 11 months ago
samuel-clarke / RealImpact
☆34 · Apr 10, 2023 · Updated 3 years ago
anton-jeran / AV-RIR
Audio-Visual Room Impulse Response Estimation
☆25 · Jul 22, 2024 · Updated 2 years ago
yingchengy / AVMOE
[NeurIPS 2024] Mixture of Experts for Audio-Visual Learning
☆25 · Jan 19, 2025 · Updated last year
dedoogong / caffe-keypoint-rcnn
Resnet-50 + FPN + Keypoint RCNN
☆14 · Jun 18, 2019 · Updated 7 years ago
zzhhfut / CCNet-AAAI2025
This repository contains code for AAAI2025 paper "Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal …
☆24 · Aug 18, 2025 · Updated 11 months ago
stoneMo / CIGN
Official implementation for CIGN
☆17 · Sep 11, 2023 · Updated 2 years ago
showlab / datacentric.vlp
Compress conventional Vision-Language Pre-training data
☆52 · Sep 22, 2023 · Updated 2 years ago
see2sound / see2sound
Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
☆141 · Mar 28, 2025 · Updated last year
ttgeng233 / UniAV
Unified Audio-Visual Perception for Multi-Task Video Localization
☆33 · Apr 19, 2024 · Updated 2 years ago
schowdhury671 / meerkat
☆35 · Jul 9, 2025 · Updated last year
ut-vision / ActionVOS
[ECCV 2024 Oral] ActionVOS: Actions as Prompts for Video Object Segmentation
☆32 · Dec 4, 2024 · Updated last year
khoiucd / escape-tgt
Official Implementation for "ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation", CVPR 2024.
☆10 · Jun 17, 2024 · Updated 2 years ago
liuhuadai / ViT-TTS
PyTorch Implementation of ViT-TTS (EMNLP'23)
☆11 · Oct 20, 2023 · Updated 2 years ago
spyflying / LSCM-Refseg
Code for Linguistic Structure Guided Context Modeling for Referring Image Segmentation, ECCV2020.
☆16 · Oct 2, 2020 · Updated 5 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
hxixixh / mix-and-localize
☆23 · Mar 20, 2024 · Updated 2 years ago
cyanbx / Frieren-V2A
Implementation of Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching (NeurIPS'24)
☆63 · Apr 3, 2025 · Updated last year
xiaoneil / LPNet
☆13 · Nov 28, 2021 · Updated 4 years ago
facebookresearch / ego-env
Human-centric environment representations from egocentric video
☆15 · Feb 5, 2026 · Updated 5 months ago
SJTUwxz / LoCoNet_ASD
code repo for LoCoNet: Long-Short Context Network for Active Speaker Detection
☆57 · May 1, 2023 · Updated 3 years ago
YYX666660 / LAVSS
Code for LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
☆19 · Feb 25, 2025 · Updated last year
CASIA-IVA-Lab / MOSO
☆35 · Jun 6, 2023 · Updated 3 years ago
Sid2697 / EgoProceL-egocentric-procedure-learning
Code implementation for our ECCV, 2022 paper titled "My View is the Best View: Procedure Learning from Egocentric Videos"
☆35 · Feb 5, 2024 · Updated 2 years ago
mdx-tutorial / mdx-tutorial.github.io
Tutorial covering Open Source tools for Source Separation.
☆15 · Nov 12, 2021 · Updated 4 years ago
karchkha / MelSpec_GPT_VQVAE
Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18 · Oct 8, 2023 · Updated 2 years ago