ardasnck / learning_to_localize_sound_source

Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes

☆92

Alternatives and similar repositories for learning_to_localize_sound_source

Users who are interested in learning_to_localize_sound_source are comparing it to the repositories listed below.

rhgao / co-separation
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆96 Updated 2 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆81 Updated 3 years ago
pedro-morgado / AVSpatialAlignment
☆29 Updated 3 years ago
shvdiwnkozbw / Multi-Source-Sound-Localization
This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.
☆85 Updated 3 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆58 Updated 3 years ago
hche11 / VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
☆324 Updated 3 years ago
facebookresearch / EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…
☆120 Updated last year
SheldonTsui / SepStereo_ECCV2020
Codebase for the paper "Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation" (ECCV2020)
☆72 Updated 4 years ago
YapengTian / AVE-ECCV18
Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018
☆186 Updated 4 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆113 Updated 4 years ago
ekazakos / auditory-slow-fast
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch
☆74 Updated 3 years ago
YapengTian / CCOL-CVPR21
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
☆25 Updated 3 years ago
roudimit / MUSIC_dataset
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆129 Updated 2 years ago
kyuyeonpooh / objects-that-sound
An unofficial implementation of the paper "Objects that Sound", from ECCV 2018.
☆31 Updated last year
stoneMo / AVGN
Official implementation for AVGN
☆35 Updated 2 years ago
akoepke / audio-retrieval-benchmark
Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".
☆51 Updated 3 weeks ago
facebookresearch / 2.5D-Visual-Sound
2.5D visual sound
☆114 Updated 2 years ago
Yu-Wu / Modaily-Aware-Audio-Visual-Video-Parsing
Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
☆24 Updated 3 years ago
stoneMo / SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆19 Updated 2 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆35 Updated 2 years ago
YapengTian / AVVP-ECCV20
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆88 Updated last year
zjsong / SSPL
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32 Updated last year
facebookresearch / FAIR-Play
2.5D visual sound dataset
☆99 Updated 3 years ago
ubc-vision / TriBERT
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…
☆14 Updated 3 years ago
denfed / heartheflow
Repository for the 2023 WACV paper: "Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization"
☆11 Updated 2 years ago
FloretCat / CMRAN
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020
☆33 Updated 4 years ago
ms-dot-k / Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
☆20 Updated 3 years ago
kkoutini / cpjku_dcase19
CP-JKU submission to DCASE 19, performant single-model CNN
☆57 Updated 4 years ago
EGO4D / audio-visual
☆66 Updated 2 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆34 Updated 2 years ago