dberghi/AV-SELD

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/dberghi/AV-SELD)

dberghi / AV-SELD

Python implementation of the paper "Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection"

☆31

Alternatives and similar repositories for AV-SELD

Users that are interested in AV-SELD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

partha2409 / DCASE2024_seld_baseline
View on GitHub
☆52Dec 13, 2025Updated 7 months ago
SonyResearch / dcase2025_stereo_seld_data_generator
View on GitHub
Data generator for stereo sound event localization and detection task of DCASE 2025 challenge
☆17Jul 17, 2025Updated last year
Orlllem / seld_wav2vec2
View on GitHub
☆18Feb 1, 2026Updated 5 months ago
Hong-Hengyi / MVANet-SELD
View on GitHub
For more detailed information, please refer to the paper titled "MVANet: Multi-Stage Video Attention Network for Sound Event Localization…
☆35May 20, 2025Updated last year
marl / SpatialScaper
View on GitHub
☆75Aug 7, 2025Updated 11 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
partha2409 / DCASE2025_seld_baseline
View on GitHub
☆27May 27, 2025Updated last year
axeber01 / ngcc-seld
View on GitHub
Sound Event Localization and Detection using Neural Generalized Cross-Correlations
☆36Feb 11, 2025Updated last year
aromanusc / SoundQ
View on GitHub
Enhanced sound event localization and detection in real 360-degree audio-visual soundscapes (DCASE task3 format)
☆14Mar 21, 2025Updated last year
danielkrause / DCASE2022-data-generator
View on GitHub
Data generator for creating synthetic audio mixtures suitable for DCASE Challenge 2022 Task 3
☆47Apr 5, 2023Updated 3 years ago
yusunnny / CST-former
View on GitHub
CST-former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and Detection (ICASSP 2024)
☆39May 20, 2025Updated last year
Jinbo-Hu / PSELDNets
View on GitHub
PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection
☆47Sep 17, 2025Updated 10 months ago
Jinbo-Hu / L3DAS22-TASK2
View on GitHub
A Track-Wise Ensemble Event Independent Network for 3D Polyphonic Sound Event Localization and Detection
☆23Nov 14, 2024Updated last year
Jinbo-Hu / DCASE2022-TASK3
View on GitHub
☆37Nov 14, 2024Updated last year
michaelneri / audio-distance-estimation
View on GitHub
Official repository of the work "Speaker Distance Estimation in Enclosures from Single-Channel Audio" published to IEEE/ACM Transactions …
☆40Jun 29, 2026Updated 3 weeks ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
sony / audio-visual-seld-dcase2023
View on GitHub
Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge
☆68Mar 19, 2025Updated last year
Audio-WestlakeU / SAR-SSL
View on GitHub
A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult…
☆40Oct 11, 2024Updated last year
swagshaw / WildDESED
View on GitHub
WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection
☆18Nov 19, 2024Updated last year
danielkrause / Moving-Binaural-SDEL
View on GitHub
Implementation of the paper "Binaural Sound Source Distance Estimation and Localization for a Moving Listener"
☆22Mar 2, 2025Updated last year
kaiw7 / STG-CMA
View on GitHub
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
☆15Apr 13, 2024Updated 2 years ago
Sara-Ahmed / ASiT
View on GitHub
ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation
☆30Mar 10, 2024Updated 2 years ago
sharathadavanne / seld-dcase2023
View on GitHub
Baseline method for sound event localization task of DCASE 2023 challenge
☆71Mar 13, 2023Updated 3 years ago
BASHLab / OWL
View on GitHub
☆15May 25, 2026Updated 2 months ago
thomeou / SALSA
View on GitHub
This is the public repository for eigenvector-based SALSA features for polyphonic sound event localization and detection.
☆114May 31, 2022Updated 4 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
zszheng147 / Spatial-AST
View on GitHub
🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models)
☆87Feb 13, 2025Updated last year
cai525 / Transformer4SED
View on GitHub
This repository aims to collect Transformer-based sound event detection (SED) algorithms.
☆104Feb 10, 2026Updated 5 months ago
juliawilkins / ambisonics2binaural_simple
View on GitHub
A simple Python script to convert FOA audio to binaural.
☆17Nov 29, 2022Updated 3 years ago
yxdong0320 / Solution_on_3D_SELD
View on GitHub
The program ranked first in Audio-only track of DCASE2024 Challenge task3.
☆22Mar 2, 2026Updated 4 months ago
Audio-WestlakeU / FN-SSL
View on GitHub
The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]
☆159Mar 10, 2026Updated 4 months ago
sharathadavanne / seld-net
View on GitHub
Sound event localization, detection, and tracking of multiple overlapping and moving sources in 2D spherical space using convolutional re…
☆405Nov 21, 2022Updated 3 years ago
CARNIVAL-IITP / Noise_suppression
View on GitHub
☆35Feb 14, 2025Updated last year
francesclluis / direction-ambisonics-source-separation
View on GitHub
Deep learning for directional sound source separation from Ambisonics mixtures.
☆31Oct 1, 2022Updated 3 years ago
yinkalario / Two-Stage-Polyphonic-Sound-Event-Detection-and-Localization
View on GitHub
A two-stage polyphonic sound event detection and localization method for both SED and DOA.
☆126Jan 8, 2023Updated 3 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
QinYang12 / SVGC-AVA
View on GitHub
☆14Aug 17, 2024Updated last year
sadPororo / AD-YOLO
View on GitHub
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection, IEEE ICASSP 2023
☆35Dec 21, 2025Updated 7 months ago
muuda / MFF-EINV2
View on GitHub
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection
☆22Jul 17, 2024Updated 2 years ago
stoneMo / DeepAVFusion
View on GitHub
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
☆43Aug 2, 2024Updated last year
yinkalario / EIN-SELD
View on GitHub
An Improved Event-Independent Network for Polyphonic Sound Event Localization and Detection
☆79Aug 5, 2021Updated 4 years ago
RBenita / DIFFAR
View on GitHub
Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
☆32Mar 8, 2024Updated 2 years ago
sharathadavanne / seld-dcase2022
View on GitHub
Baseline method for sound event localization task of DCASE 2022 challenge
☆64Jun 21, 2022Updated 4 years ago