hxixixh/mix-and-localize

☆23

Alternatives and similar repositories for mix-and-localize

Users interested in mix-and-localize are comparing it to the repositories listed below.

stoneMo / SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆22 · Dec 6, 2022 · Updated 3 years ago
OpenNLPLab / FNAC_AVL
[CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…
☆30 · Apr 10, 2023 · Updated 3 years ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
zjsong / SSPL
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32 · Jul 8, 2024 · Updated 2 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆42 · Oct 2, 2022 · Updated 3 years ago
stoneMo / AVGN
Official implementation for AVGN
☆42 · Mar 24, 2023 · Updated 3 years ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38 · Oct 11, 2024 · Updated last year
OpenNLPLab / MMVAE-AVS
Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023].
☆20 · Sep 19, 2024 · Updated last year
jczhang02 / MUSIC_dataset_script
This repo contains a script to download the MUSIC dataset from YouTube
☆12 · Jan 19, 2024 · Updated 2 years ago
stoneMo / CIGN
Official implementation for CIGN
☆17 · Sep 11, 2023 · Updated 2 years ago
vvvb-github / AVSegFormer
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer
☆74 · Mar 6, 2025 · Updated last year
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28 · Mar 14, 2026 · Updated 4 months ago
swimmiing / ACL-SSL
Repository of the IJCV'26 & WACV'24 paper
☆35 · Apr 27, 2026 · Updated 3 months ago
zihuixue / seeAoT
Code and data release for the paper "Seeing the Arrow of Time in Large Multimodal Models"
☆16 · Oct 2, 2025 · Updated 9 months ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
GenjiB / LAVISH
Vision Transformers are Parameter-Efficient Audio-Visual Learners
☆107 · Aug 11, 2023 · Updated 2 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week
SAGNIKMJR / ego-AV-spatial-correspondence
[CVPR 2024] Code and datasets for 'Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos'
☆14 · Jun 16, 2024 · Updated 2 years ago
kaistmm / SSLalignment
☆38 · May 28, 2025 · Updated last year
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 · Nov 6, 2020 · Updated 5 years ago
KawhiZhao / Egocentric-Audio-Visual-Speaker-Localization
Code for the paper "Audio Visual Speaker Localization from EgoCentric Views"
☆11 · Jul 3, 2024 · Updated 2 years ago
stoneMo / DeepAVFusion
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
☆43 · Aug 2, 2024 · Updated last year
JiabenChen / iQuery
[CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation
☆73 · Jul 25, 2023 · Updated 3 years ago
yannqi / COMBO-AVS
[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-…
☆40 · Apr 20, 2025 · Updated last year
YuanGongND / cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
☆292 · Mar 20, 2024 · Updated 2 years ago
hhc1997 / vggsound_download
Download the VGGSound dataset
☆22 · Feb 22, 2022 · Updated 4 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆210 · Apr 3, 2021 · Updated 5 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
ruohaoguo / avis
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆52 · Jun 5, 2025 · Updated last year
zihuixue / MFH
[ICLR 23 oral] The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
☆44 · Jul 10, 2023 · Updated 3 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90 · Jul 25, 2024 · Updated 2 years ago
facebookresearch / EgoT2
Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)
☆34 · Jun 12, 2023 · Updated 3 years ago
IFICL / stereocrw
Code for the Paper: [ECCV2022] Sound Localization by Self-Supervised Time-Delay Estimation
☆28 · Mar 15, 2023 · Updated 3 years ago
zeroone-universe / TowardsRobustSpeechSR
Unofficial Pytorch Lightning Implementation of "Towards Robust Speech Super-Resolution"
☆10 · May 8, 2023 · Updated 3 years ago
sony / audio-visual-seld-dcase2023
Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge
☆68 · Mar 19, 2025 · Updated last year
shlizee / savvy
Repository for SAVVY (Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing) Benchmark and SAVVY model
☆25 · May 30, 2026 · Updated last month
SAGNIKMJR / move2hear-active-AV-separation
Code and datasets for 'Move2Hear: Active Audio-Visual Source Separation' (ICCV 2021)
☆16 · Jun 17, 2026 · Updated last month
lqiang67 / generative-models-on-toys
generative models on toys
☆12 · Sep 10, 2024 · Updated last year