HS-YN/PanoAVQA

Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

☆16

Alternatives and similar repositories for PanoAVQA

Users who are interested in PanoAVQA are comparing it to the repositories listed below.

GeWu-Lab / MUSIC-AVQA
MUSIC-AVQA, CVPR2022 (ORAL)
☆100 · Dec 30, 2022 · Updated 3 years ago
HS-YN / PAVER
Official repository of Panoramic Vision Transformer for Saliency Detection in 360° Videos (ECCV 2022)
☆43 · Nov 7, 2022 · Updated 3 years ago
l3das / L3DAS23
Official repository supporting the L3DAS23 IEEE ICASSP Grand Challenge
☆16 · Feb 10, 2023 · Updated 3 years ago
DTaoo / DMC
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15 · May 27, 2020 · Updated 6 years ago
ExplainableML / AVCA-GZSL
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆43 · Nov 29, 2022 · Updated 3 years ago
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week
swimmiing / ACL-SSL
Repository of the IJCV'26 & WACV'24 paper
☆34 · Apr 27, 2026 · Updated 2 months ago
DTaoo / Simplified_DMC
A simplified version of DMC (Deep Multimodal Clustering for Unsupervised Audiovisual Learning)
☆19 · May 27, 2020 · Updated 6 years ago
alvinliu0 / Visual-Sound-Localization-in-the-Wild
Code for Visual Sound Localization in the Wild by Cross-Modal Interference Erasing (AAAI 2022).
☆29 · Feb 15, 2022 · Updated 4 years ago
jaeyeonkim99 / visage
Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)
☆47 · Sep 10, 2025 · Updated 10 months ago
SheldonTsui / PseudoBinaural_CVPR2021
Codebase for the paper "Visually Informed Binaural Audio Generation without Binaural Audios" (CVPR 2021)
☆72 · Jul 8, 2021 · Updated 5 years ago
rxtan2 / AVSeT
☆17 · Oct 2, 2023 · Updated 2 years ago
bckim92 / colloquial-claims
✅ How Robust are Fact Checking Systems on Colloquial Claims? In NAACL-HLT, 2021.
☆22 · Jul 1, 2021 · Updated 5 years ago
ahnjaewoo / MPCHAT
📸 Code and Dataset for our ACL 2023 paper: "MPCHAT: Towards Multimodal Persona-Grounded Conversation"
☆22 · Sep 5, 2023 · Updated 2 years ago
weiguoPian / AV-CIL_ICCV2023
[ICCV 2023] Audio-Visual Class-Incremental Learning
☆35 · Sep 29, 2024 · Updated last year
DTaoo / Discriminative-Sounding-Objects-Localization
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)
☆61 · Jan 19, 2022 · Updated 4 years ago
YapengTian / CCOL-CVPR21
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
☆26 · Nov 24, 2021 · Updated 4 years ago
facebookresearch / learning-audio-visual-dereverberation
Code for the paper "Learning Audio-Visual Dereverberation"
☆32 · Aug 10, 2022 · Updated 3 years ago
9rum / flatflow
Fast and exact parallel training of neural networks
☆13 · Updated this week
IFICL / stereocrw
Code for the Paper: [ECCV2022] Sound Localization by Self-Supervised Time-Delay Estimation
☆28 · Mar 15, 2023 · Updated 3 years ago
v-manhlt3 / m-LTM-Audio-Text-Retrieval
☆13 · Jan 5, 2025 · Updated last year
ardasnck / learning_to_localize_sound_source
Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes
☆102 · Dec 4, 2024 · Updated last year
pedro-morgado / spatialaudiogen
Spatial Audio Generation
☆117 · Mar 24, 2023 · Updated 3 years ago
jczhang02 / MUSIC_dataset_script
This repo contains a script to download the MUSIC dataset from YouTube
☆12 · Jan 19, 2024 · Updated 2 years ago
OpenNLPLab / MMVAE-AVS
Multimodal Variational Auto-encoder based Audio-Visual Segmentation [ICCV2023].
☆20 · Sep 19, 2024 · Updated last year
stoneMo / MGN
Official implementation for MGN
☆20 · Dec 22, 2022 · Updated 3 years ago
ecrireme / SPR
Official Repository for our ICCV2021 paper: Continual Learning on Noisy Data Streams via Self-Purified Replay
☆31 · Jan 5, 2022 · Updated 4 years ago
limuhit / pseudocylindrical_convolution
Pseudocylindrical convolutions for Learned Omnidirectional Image Compression
☆13 · Jan 16, 2026 · Updated 6 months ago
XYPB / CondFoleyGen
Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".
☆93 · Dec 8, 2023 · Updated 2 years ago
ruohaoguo / avis
[CVPR 2025] 🔥 Official impl. of "Audio-Visual Instance Segmentation".
☆52 · Jun 5, 2025 · Updated last year
V-Sense / 360AudioVisual
This repository contains materials for the paper: Towards generating ambisonics using audio-visual cue for virtual reality
☆13 · Jul 2, 2019 · Updated 7 years ago
ku-vai / TPoS
This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV2023)
☆25 · Dec 7, 2023 · Updated 2 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
keithnoguchi / do-in-action
DO with Terraform and Ansible
☆11 · Jun 5, 2018 · Updated 8 years ago
showlab / mist
☆37 · Dec 20, 2023 · Updated 2 years ago
ai4cmb / NNhealpix
Neural networks on the Healpix sphere
☆18 · Nov 14, 2023 · Updated 2 years ago
skywalker023 / pragmatic-consistency
🤖 Code for our EMNLP 2020 paper: "Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness"
☆37 · Oct 12, 2020 · Updated 5 years ago
MSR-LIT / MultilingualBias
☆10 · Jul 6, 2023 · Updated 3 years ago
passing2961 / DialogCC
Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase…
☆13 · Jun 24, 2024 · Updated 2 years ago