This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.
☆96Oct 18, 2021Updated 4 years ago
Alternatives and similar repositories for Multi-Source-Sound-Localization
Users that are interested in Multi-Source-Sound-Localization are comparing it to the libraries listed below
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes☆97Dec 4, 2024Updated last year
- Code for Discriminative Sounding Objects Localization (NeurIPS 2020)☆59Jan 19, 2022Updated 4 years ago
- Code for Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization.☆10Sep 28, 2021Updated 4 years ago
- The unofficial implementation of paper, "Objects that Sound", from ECCV 2018.☆31Jan 29, 2024Updated 2 years ago
- Files for the paper: "Sound Source Localization using Deep Residual Learning"☆24Nov 13, 2017Updated 8 years ago
- Neural Network based Sound Source Localization Models☆47Aug 29, 2023Updated 2 years ago
- Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing, ECCV, 2020. (Spotlight)☆89Jul 25, 2024Updated last year
- Localizing Visual Sounds the Hard Way☆82Jul 6, 2022Updated 3 years ago
- Quaternion Neural Networks for 3D Sound Source Localization in Reverberant Environments.☆19Nov 21, 2022Updated 3 years ago
- Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)☆20Dec 6, 2022Updated 3 years ago
- Co-Separating Sounds of Visual Objects (ICCV 2019)☆99Jul 25, 2023Updated 2 years ago
- PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…☆32Jul 8, 2024Updated last year
- Sound event localization, detection, and tracking of multiple overlapping and moving sources in 2D spherical space using convolutional re…☆380Nov 21, 2022Updated 3 years ago
- Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation☆26Nov 24, 2021Updated 4 years ago
- Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark☆35May 7, 2025Updated 9 months ago
- Audio-Visual Event Localization in Unconstrained Videos, ECCV 2018☆203Apr 3, 2021Updated 4 years ago
- The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]☆141Feb 5, 2026Updated 3 weeks ago
- Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020☆32Nov 6, 2020Updated 5 years ago
- Learning to Separate Object Sounds by Watching Unlabeled Video (ECCV 2018)☆50Sep 24, 2019Updated 6 years ago
- Real-time end-to-end singing voice conversion☆24Nov 3, 2024Updated last year
- cross modal background suppression for audio-visual event localization☆36Mar 18, 2022Updated 3 years ago
- A curated list of different papers and datasets in various areas of audio-visual processing☆766Jan 30, 2024Updated 2 years ago
- Unofficial Implementation of Google Deepmind's paper `Objects that Sound`☆83May 7, 2018Updated 7 years ago
- Official repository of the work "Low-complexity Unsupervised Audio Anomaly Detection exploiting Separable Convolutions and Angular Loss" …☆10Nov 6, 2024Updated last year
- Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)☆40Oct 2, 2022Updated 3 years ago
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder☆12Mar 11, 2025Updated 11 months ago
- Implementation of Hilbert beamforming for SNN-based audio source localisation☆16Oct 2, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022).☆29Feb 15, 2022Updated 4 years ago
- soundnet and localize sound source☆12Dec 7, 2020Updated 5 years ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆15Dec 10, 2024Updated last year