DTaoo/Discriminative-Sounding-Objects-Localization

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/DTaoo/Discriminative-Sounding-Objects-Localization)

DTaoo / Discriminative-Sounding-Objects-Localization

Code for Discriminative Sounding Objects Localization (NeurIPS 2020)

☆61

Alternatives and similar repositories for Discriminative-Sounding-Objects-Localization

Users that are interested in Discriminative-Sounding-Objects-Localization are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

shvdiwnkozbw / Multi-Source-Sound-Localization
View on GitHub
This repo aims to perform sound localization in complex audiovisual scenes, where there multiple objects making sounds.
☆96Oct 18, 2021Updated 4 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
View on GitHub
Localizing Visual Sounds the Hard Way
☆84Jul 6, 2022Updated 4 years ago
YapengTian / AVVP-ECCV20
View on GitHub
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)
☆90Jul 25, 2024Updated last year
ardasnck / learning_to_localize_sound_source
View on GitHub
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102Dec 4, 2024Updated last year
rhgao / co-separation
View on GitHub
Co-Separating Sounds of Visual Objects (ICCV 2019)
☆98Jul 25, 2023Updated 2 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
FloretCat / CMRAN
View on GitHub
Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization， ACM MM 2020
☆33Nov 6, 2020Updated 5 years ago
DTaoo / DMC
View on GitHub
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15May 27, 2020Updated 6 years ago
Yu-Wu / Modaily-Aware-Audio-Visual-Video-Parsing
View on GitHub
Code for CVPR 2021 paper Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing
☆24Dec 29, 2021Updated 4 years ago
zjsong / SSPL
View on GitHub
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32Jul 8, 2024Updated 2 years ago
WangHelin1997 / GL-AT
View on GitHub
Pytorch implementation of the paper : A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13Feb 6, 2021Updated 5 years ago
afourast / avobjects
View on GitHub
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114Nov 16, 2020Updated 5 years ago
marmot-xy / CMBS
View on GitHub
cross modal background suppression for audio-visual event localization
☆36Mar 18, 2022Updated 4 years ago
stoneMo / SLAVC
View on GitHub
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆21Dec 6, 2022Updated 3 years ago
facebookresearch / 2.5D-Visual-Sound
View on GitHub
2.5D visual sound
☆121Jul 25, 2023Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
HS-YN / PanoAVQA
View on GitHub
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)
☆16Oct 12, 2021Updated 4 years ago
roudimit / MUSIC_dataset
View on GitHub
MUSIC Dataset from The Sound of Pixels (ECCV '18)
☆137Aug 12, 2022Updated 3 years ago
JuanFMontesinos / Solos
View on GitHub
Solos: A Dataset for Audio-Visual Music Analysis
☆24Feb 17, 2023Updated 3 years ago
HumamAlwassel / XDC
View on GitHub
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91Oct 24, 2022Updated 3 years ago
hxixixh / mix-and-localize
View on GitHub
☆23Mar 20, 2024Updated 2 years ago
brian7685 / Multimodal-Clustering-Network
View on GitHub
ICCV 2021
☆34May 11, 2022Updated 4 years ago
rohitrango / objects-that-sound
View on GitHub
Unofficial Implementation of Google Deepmind's paper `Objects that Sound`
☆83May 7, 2018Updated 8 years ago
hangzhaomit / Sound-of-Pixels
View on GitHub
Codebase for ECCV18 "The Sound of Pixels"
☆393Apr 25, 2022Updated 4 years ago
GeWu-Lab / MUSIC-AVQA
View on GitHub
MUSIC-AVQA, CVPR2022 (ORAL)
☆100Dec 30, 2022Updated 3 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
hohsiangwu / rethinking-visual-sound-localization
View on GitHub
Official implementation of the paper How to Listen? Rethinking Visual Sound Localization
☆18Apr 25, 2022Updated 4 years ago
DTaoo / Simplified_DMC
View on GitHub
A simplified version for DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning)
☆19May 27, 2020Updated 6 years ago
visipedia / ssw60
View on GitHub
Sapsucker Woods 60 Audiovisual Dataset
☆19Oct 7, 2022Updated 3 years ago
fakufaku / pyramic-dataset
View on GitHub
48-Channel Anechoic Audio Recordings of 3D Sources
☆17Feb 4, 2020Updated 6 years ago
GeWu-Lab / awesome-audiovisual-learning
View on GitHub
A curated list of audio-visual learning methods and datasets.
☆288Dec 3, 2024Updated last year
ubc-vision / TriBERT
View on GitHub
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…
☆14Dec 9, 2021Updated 4 years ago
atsiami / STAViS
View on GitHub
Spatio-Temporal AudioVisual Saliency Network
☆56Jan 23, 2024Updated 2 years ago
valterlej / zsarcap
View on GitHub
Official code for Tell Me What You See: A Zero-Shot Action Recognition Method Based on Natural Language Descriptions (Multimedia Tools an…
☆13Mar 8, 2024Updated 2 years ago
PeihaoChen / regnet
View on GitHub
Official PyTorch implementation of the TIP paper "Generating Visually Aligned Sound from Videos" and the corresponding Visually Aligned S…
☆53Dec 15, 2020Updated 5 years ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
GeWu-Lab / MMCosine_ICASSP23
View on GitHub
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26May 18, 2023Updated 3 years ago
facebookresearch / FAIR-Play
View on GitHub
2.5D visual sound dataset
☆108Sep 21, 2021Updated 4 years ago
v-iashin / SparseSync
View on GitHub
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56Jan 29, 2024Updated 2 years ago
Mark12Ding / FAME
View on GitHub
[CVPR 2022] Code for Motion-aware Contrastive Video Representation Learning via Foreground-background Merging
☆51Sep 30, 2023Updated 2 years ago
TengdaHan / CoCLR
View on GitHub
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆288Oct 10, 2021Updated 4 years ago
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
View on GitHub
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28Mar 14, 2026Updated 4 months ago
descriptinc / lyrebird-wav2clip
View on GitHub
Official implementation of the paper WAV2CLIP: LEARNING ROBUST AUDIO REPRESENTATIONS FROM CLIP
☆359Feb 15, 2022Updated 4 years ago