hohsiangwu/rethinking-visual-sound-localization

Official implementation of the paper "How to Listen? Rethinking Visual Sound Localization"

☆18

Alternatives and similar repositories for rethinking-visual-sound-localization

Users that are interested in rethinking-visual-sound-localization are comparing it to the libraries listed below.

adrianSRoman / DeepWaveTorch
DeepWave: A Recurrent Neural-Network for Real-Time Acoustic Imaging (PyTorch implementation)
☆23 · Jul 4, 2024 · Updated 2 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆42 · Oct 2, 2022 · Updated 3 years ago
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102 · Dec 4, 2024 · Updated last year
rabitt / ismir-2021-tutorial-case-studies
Code for the ISMIR 2021 tutorial "Programming MIR Baselines from Scratch: Three Case Studies"
☆30 · Nov 21, 2021 · Updated 4 years ago
DTaoo / Simplified_DMC
A simplified version of DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning)
☆19 · May 27, 2020 · Updated 6 years ago
JaesungHuh / av-diarization
Audio-visual diarization pipeline used for creating the VoxConverse dataset
☆22 · Jun 6, 2025 · Updated last year
zghafoori / Deep-Multi-Sphere-SVDD
☆17 · Nov 22, 2022 · Updated 3 years ago
JaniceWuo / PoetryQA
A poetry question-answering demo built with a knowledge graph
☆11 · Mar 11, 2020 · Updated 6 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week
micarraylib / micarraylib
Software for Reproducible Aggregation, Standardization, and Signal Processing of Microphone Array Datasets
☆20 · Apr 18, 2022 · Updated 4 years ago
AnWang-AI / AugABSA
This repository contains the code for the *SEM 2023 paper “Generative Data Augmentation for Aspect Sentiment Quad Prediction”.
☆10 · May 30, 2023 · Updated 3 years ago
wxy16 / SSFNet
SSFNet: Lightweight real-time network for semantic segmentation of UAV remote sensing images
☆12 · Apr 20, 2024 · Updated 2 years ago
yammadev / cbrs
Case-based Reasoning (CBR) System
☆10 · Apr 8, 2020 · Updated 6 years ago
solashirai / WWW-EvCBR
Code associated with the WWW'23 paper "Event Prediction using Case-Based Reasoning over Knowledge Graphs"
☆18 · Nov 8, 2023 · Updated 2 years ago
behzadhaki / MonotonicGrooveTransformer
☆24 · Mar 1, 2023 · Updated 3 years ago
karreny / telling-left-from-right
Project website for "Telling left from right: Learning spatial correspondence between sight and sound"
☆29 · Jun 6, 2022 · Updated 4 years ago
ChaitanyaKher / Sound-Localization-using-Microphone-Array-Beamforming
This was a project that I did for my course. The project finds the direction of arrival and displays it using polar plots. If you have an…
☆19 · Jul 18, 2019 · Updated 7 years ago
km1994 / nlp_paper_study_bert
This repository mainly records study notes on top-conference papers relevant to NLP algorithm engineers (BERT edition)
☆13 · Apr 4, 2023 · Updated 3 years ago
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
afourast / avobjects
Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video"
☆114 · Nov 16, 2020 · Updated 5 years ago
brian7685 / Multimodal-Clustering-Network
ICCV 2021
☆34 · May 11, 2022 · Updated 4 years ago
zjsong / SSPL
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32 · Jul 8, 2024 · Updated 2 years ago
mdx-tutorial / mdx-tutorial.github.io
Tutorial covering Open Source tools for Source Separation.
☆15 · Nov 12, 2021 · Updated 4 years ago
psaylor / spoke
A framework for building speech-enabled websites.
☆10 · Jul 10, 2015 · Updated 11 years ago
jpeper / GEN_SCL_NAT
☆22 · Mar 28, 2024 · Updated 2 years ago
MaoYuxin25 / TransformerSOD
A unified framework for Transformer supervised/weakly supervised SOD, RGBD SOD, COD
☆44 · Feb 4, 2022 · Updated 4 years ago
joann8512 / SusPedal-Gen
This is the repository for Learning to Generate Piano Music With Sustain Pedals
☆12 · Nov 23, 2023 · Updated 2 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
nikvaessen / disjoint-mtl
Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf
☆12 · Dec 2, 2024 · Updated last year
jwr1995 / WD-TCN
☆11 · Aug 5, 2022 · Updated 3 years ago
wayne391 / 2018_Deep_Learning_HW-Image_SR
SISR: RDN & RCAN
☆11 · Dec 23, 2024 · Updated last year
BoyuanChen / boombox
Code release for paper: The Boombox: Visual Reconstruction from Acoustic Vibrations
☆15 · May 18, 2021 · Updated 5 years ago
Bartelds / ctc-dro
Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.
☆17 · May 16, 2025 · Updated last year
DTaoo / DMC
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15 · May 27, 2020 · Updated 6 years ago
Saurabhbhati / DASS
☆12 · Apr 26, 2025 · Updated last year
hmt2014 / AspectQuad
Experimental code and data of paper "Improving Aspect Sentiment Quad Prediction via Template-Order Data Augmentation", EMNLP 2022
☆18 · Dec 3, 2022 · Updated 3 years ago
biboamy / IAD
☆10 · Aug 2, 2019 · Updated 6 years ago
rxtan2 / video-grounding-narrations
☆12 · Mar 12, 2023 · Updated 3 years ago
soCzech / LookForTheChange
Code for Look for the Change paper published at CVPR 2022
☆36 · Oct 26, 2022 · Updated 3 years ago