hohsiangwu / rethinking-visual-sound-localization
Official implementation of the paper "How to Listen? Rethinking Visual Sound Localization"
☆17 · Apr 25, 2022 · Updated 3 years ago
Alternatives and similar repositories for rethinking-visual-sound-localization
Users who are interested in rethinking-visual-sound-localization compare it to the libraries listed below.
- pytorch code for sound event localization and classification ☆13 · Aug 12, 2021 · Updated 4 years ago
- Code for the ISMIR 2021 tutorial "Programming MIR Baselines from Scratch: Three Cases Studies" ☆30 · Nov 21, 2021 · Updated 4 years ago
- ☆17 · Nov 22, 2022 · Updated 3 years ago
- Official Repository for "Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization" (ACM MM 2023) ☆18 · Nov 14, 2023 · Updated 2 years ago
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes ☆97 · Dec 4, 2024 · Updated last year
- DeepWave: A Recurrent Neural-Network for Real-Time Acoustic Imaging (PyTorch implementation) ☆23 · Jul 4, 2024 · Updated last year
- Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022) ☆39 · Oct 2, 2022 · Updated 3 years ago
- Audio-visual diarization pipeline used for creating VoxConverse dataset ☆21 · Jun 6, 2025 · Updated 8 months ago
- PyTorch implementation of ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos" ☆40 · Dec 15, 2020 · Updated 5 years ago
- Question-Aware Gaussian Experts for Audio-Visual Question Answering -- Official Pytorch Implementation (CVPR'25, Highlight) ☆26 · Jun 6, 2025 · Updated 8 months ago
- [2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization ☆40 · Mar 7, 2025 · Updated 11 months ago
- ☆27 · Mar 21, 2024 · Updated last year
- Project website for "Telling left from right: Learning spatial correspondence between sight and sound" ☆25 · Jun 6, 2022 · Updated 3 years ago
- Code for Discriminative Sounding Objects Localization (NeurIPS 2020) ☆59 · Jan 19, 2022 · Updated 4 years ago
- Implementation for ECCV20 paper "Self-Supervised Learning of audio-visual objects from video" ☆115 · Nov 16, 2020 · Updated 5 years ago
- Code for Look for the Change paper published at CVPR 2022 ☆36 · Oct 26, 2022 · Updated 3 years ago
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆34 · Feb 21, 2025 · Updated 11 months ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation ☆41 · Dec 23, 2023 · Updated 2 years ago
- Collection of Deep Reinforcement Learning Jupyter Notebooks. Each notebook is self-contained and presents single algorithm. These include… ☆38 · Mar 7, 2020 · Updated 5 years ago
- code for our ACM MM 2020 best paper "PiRhDy: Learning Pitch-, Rhythm-, and Dynamics-aware Embeddings for Symbolic Music" ☆32 · Mar 13, 2022 · Updated 3 years ago
- ICCV 2021 ☆34 · May 11, 2022 · Updated 3 years ago
- Data generator for creating synthetic audio mixtures suitable for DCASE Challenge 2022 Task 3 ☆44 · Apr 5, 2023 · Updated 2 years ago
- ☆15 · Apr 7, 2025 · Updated 10 months ago
- For the convenience of everyone preparing for the postgraduate entrance exam ☆10 · Sep 8, 2021 · Updated 4 years ago
- script to extract frames from HMDB51 dataset and create train, test and val split ☆10 · Feb 26, 2019 · Updated 6 years ago
- Multi-Agent LLM System for Digital Scam Protection ☆12 · Dec 19, 2024 · Updated last year
- Official repository for "DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Opinion Text Generation" ☆10 · May 20, 2022 · Updated 3 years ago
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling" ☆19 · Nov 3, 2025 · Updated 3 months ago
- Repository for the code assignment of the Deep Learning 1 course, Fall 2021 edition ☆10 · Oct 31, 2022 · Updated 3 years ago
- Residual Quantization Autoencoder, used for interpreting LLMs ☆13 · Jan 1, 2025 · Updated last year
- Official codebase for "Context Aware Deep Learning for Multi Modal Depression Detection" [ICASSP 2019, Oral] ☆11 · Dec 26, 2024 · Updated last year
- Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting ☆12 · Mar 24, 2023 · Updated 2 years ago
- BachDuet enables a human performer to improvise a duet counterpoint with a computer agent in real time. ☆14 · Aug 8, 2022 · Updated 3 years ago
- ☆12 · Apr 26, 2025 · Updated 9 months ago
- Notes for CS294/194-196: Large Language Model Agents (Fall 2024, UC Berkeley), summarizing 12 lectures on LLM fundamentals, reasoning, pl… ☆14 · Jan 7, 2025 · Updated last year
- ☆10 · Jun 13, 2022 · Updated 3 years ago
- ☆10 · Jun 3, 2019 · Updated 6 years ago
- Official code for the paper "Does CLIP's Generalization Performance Mainly Stem from High Train-Test Similarity?" (ICLR 2024) ☆10 · Aug 26, 2024 · Updated last year
- This is the accompanying repository to the paper - Automatic Estimation of Singing Voice Musical Dynamics ☆15 · Oct 28, 2024 · Updated last year