VisualAIKHU/SIRA-SSL

Official Repository for "Audio-Visual Spatial Integration and Recursive Attention for Robust Sound Source Localization" (ACM MM 2023)

☆18

Alternatives and similar repositories for SIRA-SSL

Users who are interested in SIRA-SSL are comparing it to the repositories listed below.

VisualAIKHU / NoPrior_MultiSSL
Official Repository for "Learning to Visually Localize Sound Sources from Mixtures without Prior Source Knowledge" (CVPR 2024)
☆16 · Sep 1, 2024 · Updated last year
denfed / heartheflow
Repository for the 2023 WACV paper: "Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization"
☆12 · Dec 21, 2022 · Updated 3 years ago
VisualAIKHU / Missing-AVQA
Official Repository for "Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality" (ECCV 2024)
☆16 · Oct 29, 2024 · Updated last year
VisualAIKHU / SAMPD
Official Repository for "Multispectral Pedestrian Detection with Sparsely Annotated Label" (AAAI 2025)
☆32 · Apr 28, 2025 · Updated last year
GeWu-Lab / PSTP-Net
☆17 · Aug 11, 2023 · Updated 2 years ago
GeWu-Lab / Generalizable-Audio-Visual-Segmentation
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
☆28 · Mar 14, 2026 · Updated 4 months ago
hohsiangwu / rethinking-visual-sound-localization
Official implementation of the paper "How to Listen? Rethinking Visual Sound Localization"
☆18 · Apr 25, 2022 · Updated 4 years ago
BingYang-20 / DP-RTF-Learning
A Python implementation of "Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization" [TASLP 2021]
☆28 · Feb 11, 2023 · Updated 3 years ago
sangmin-git / MMSI
Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)
☆19 · Jun 23, 2024 · Updated 2 years ago
hche11 / Localizing-Visual-Sounds-the-Hard-Way
Localizing Visual Sounds the Hard Way
☆84 · Jul 6, 2022 · Updated 4 years ago
IFICL / SLfM
Official code for the paper: [ICCV 2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Updated this week
zjsong / SSPL
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022…
☆32 · Jul 8, 2024 · Updated 2 years ago
joannahong / AV-RelScore
Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…
☆35 · Jun 20, 2023 · Updated 3 years ago
speedyseal / audiosetdl
Scripts for downloading AudioSet
☆89 · Nov 7, 2017 · Updated 8 years ago
3DHCG / Jittor_DiffPoseTalk
Jittor implementation of DiffPoseTalk (SIGGRAPH 2024)
☆25 · Nov 11, 2024 · Updated last year
yammadev / cbrs
Case-based Reasoning (CBR) System
☆10 · Apr 8, 2020 · Updated 6 years ago
solashirai / WWW-EvCBR
Code associated with the WWW'23 paper "Event Prediction using Case-Based Reasoning over Knowledge Graphs"
☆18 · Nov 8, 2023 · Updated 2 years ago
ChaitanyaKher / Sound-Localization-using-Microphone-Array-Beamforming
This was a project that I did for my course. The project finds the direction of arrival and displays it using polar plots. If you have an…
☆19 · Jul 18, 2019 · Updated 7 years ago
MayukeM / DataStructure_KG
Data Structure Knowledge Graph
☆19 · Jun 18, 2021 · Updated 5 years ago
advanc3dUA / WohnungSuchen
🏠🔍 Auto check for new apartments in Hamburg from various real estate providers
☆16 · Apr 15, 2026 · Updated 3 months ago
swimmiing / ACL-SSL
Repository of the IJCV'26 & WACV'24 paper
☆34 · Apr 27, 2026 · Updated 2 months ago
ByungKwanLee / YOLO-Dyanmic-ROS
Modification to YOLO for improving Dynamic Real-Time Processing on Robotics Operating Systems for Autonomous Vehicle System
☆21 · Feb 16, 2022 · Updated 4 years ago
BingYang-20 / SRP-DNN
A Python implementation of "SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization" [ICASSP 2022]
☆66 · Sep 28, 2024 · Updated last year
EmreAydinn / Package-program-for-image-segmentation-algorithms.
There are 4 different segmentation algorithms available: watershed segmentation, Split and Merge, Active Contour, Region Segmanta…
☆14 · Jun 12, 2017 · Updated 9 years ago
SOUNDS-RESEARCH / complex_neural_source_localization
Complex-valued neural networks for DOA estimation
☆31 · Jan 25, 2023 · Updated 3 years ago
PRIS-CV / Category-Specific-Prompt
Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"
☆14 · Feb 21, 2024 · Updated 2 years ago
idansc / simple-avsd
Code for "A Simple Baseline for Audio-Visual Scene-Aware Dialog"
☆27 · May 26, 2020 · Updated 6 years ago
zihuixue / MKE
[ICCV 2021] Multimodal Knowledge Expansion
☆10 · Aug 28, 2021 · Updated 4 years ago
ByungKwanLee / Robust-TopView
Robustly Converting Camera View from Normal View to Top View for Autonomous Vehicle System on Robotics Operating System (ROS)
☆24 · Jan 29, 2020 · Updated 6 years ago
ByungKwanLee / Adavanced-ECMS
Advanced Energy Control Management System (Advanced-ECMS) for Electrical Vehicle System using proposed Plus Version of Alternating Direct…
☆29 · Feb 15, 2022 · Updated 4 years ago
mispchallenge / MISP2021-AVSR
Repository for the paper "Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis"
☆18 · Jun 17, 2022 · Updated 4 years ago
FYJNEVERFOLLOWS / Awesome-Sound-Source-Localization
A tutorial for Sound Source Localization researchers and practitioners. The purpose of this repo is to organize the world’s resources for…
☆59 · Mar 17, 2023 · Updated 3 years ago
ByungKwanLee / Double-Debiased-Adversary
[ICCV 2023] Official PyTorch Implementation for "Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial …
☆31 · Oct 13, 2023 · Updated 2 years ago
stoneMo / SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆21 · Dec 6, 2022 · Updated 3 years ago
VIM-Bench / VIM_TOOL
☆12 · Jun 12, 2024 · Updated 2 years ago
KoDohwan / VT-TWINS
Video-Text Representation Learning via Differentiable Weak Temporal Alignment (PyTorch implementation for the CVPR 2022 paper)
☆11 · Oct 12, 2022 · Updated 3 years ago
OpenNLPLab / MMVAE-AVS
Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV 2023].
☆20 · Sep 19, 2024 · Updated last year
sauradip / MUPPET
[arXiv 2023] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection"
☆16 · Aug 30, 2023 · Updated 2 years ago
JustinYuu / MM_Pyramid
[ACM MM 2022] MM_Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing
☆15 · Aug 26, 2022 · Updated 3 years ago