KawhiZhao/Egocentric-Audio-Visual-Speaker-Localization

Code for the paper "Audio Visual Speaker Localization from EgoCentric Views"

☆11

Alternatives and similar repositories for Egocentric-Audio-Visual-Speaker-Localization

Users that are interested in Egocentric-Audio-Visual-Speaker-Localization are comparing it to the libraries listed below.

stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
WikiChao / Ego-AV-Loc
[CVPR 2023] Egocentric Audio-Visual Object Localization
☆27 · Jan 6, 2024 · Updated 2 years ago
facebookresearch / EasyComDataset
The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…
☆143 · Dec 4, 2023 · Updated 2 years ago
JoaquinChou / Acousitc-Net
☆16 · Apr 9, 2022 · Updated 4 years ago
axeber01 / wav2pos
3D Sound Source Localization using Masked Autoencoders
☆21 · Feb 12, 2025 · Updated last year
rxtan2 / AVSeT
☆17 · Oct 2, 2023 · Updated 2 years ago
YananLi18 / SNESet
☆11 · Jun 3, 2024 · Updated 2 years ago
stoneMo / EZ-VSL
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆42 · Oct 2, 2022 · Updated 3 years ago
Audio-WestlakeU / SAR-SSL
A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult…
☆40 · Oct 11, 2024 · Updated last year
PrinzOwO / libgtp5gnl
libgtp5gnl - netlink library for Linux kernel module 5G GTP-U
☆16 · Jun 30, 2021 · Updated 5 years ago
MengboLi / MS-SENet
☆11 · Jul 16, 2024 · Updated 2 years ago
StevenHickson / CreateNormals
☆11 · Nov 22, 2019 · Updated 6 years ago
stoneMo / SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆22 · Dec 6, 2022 · Updated 3 years ago
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
zoezou2015 / abs_pretraining
☆10 · Apr 28, 2021 · Updated 5 years ago
fakufaku / doamm
Implementation of algorithms for refinement of direction of arrival estimators by optimization
☆15 · Jun 2, 2021 · Updated 5 years ago
prerak23 / Dir_SrcMic_DOA
Codebase of the submitted work in ICASSP 2023
☆14 · Nov 30, 2022 · Updated 3 years ago
XZWY / SpatialCodec
Implementation of SpatialCodec.
☆71 · Sep 23, 2023 · Updated 2 years ago
yangyi0818 / DOA-estimation-with-a-stacked-self-attention-network
A stacked self-attention network for two-dimensional direction-of-arrival estimation in hands-free speech communication
☆12 · Sep 12, 2024 · Updated last year
IFICL / SLfM
Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
☆43 · Jul 16, 2026 · Updated last week
hxixixh / mix-and-localize
☆23 · Mar 20, 2024 · Updated 2 years ago
DA-MUSIC / DR-MUSIC_ICASSP23
☆14 · May 27, 2023 · Updated 3 years ago
popcornell / OSDC
☆18 · Jan 26, 2021 · Updated 5 years ago
jinxiang-liu / anno-free-AVS
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
☆38 · Oct 11, 2024 · Updated last year
chzhang18 / StereoEchoes
Stereo Depth Estimation with Echoes at ECCV 2022
☆10 · Sep 20, 2022 · Updated 3 years ago
jijunkai / Transformer_Music
☆15 · Sep 26, 2023 · Updated 2 years ago
yyysjz1997 / Awesome-AudioVision-Multimodal
A list of current Audio-Vision Multimodal with awesome resources (paper, application, data, review, survey, etc.).
☆34 · Oct 11, 2023 · Updated 2 years ago
manman1995 / Mutual-Information-driven-Pan-sharpening
Mutual Information-driven Pan-sharpening
☆16 · Jun 20, 2023 · Updated 3 years ago
SAGNIKMJR / move2hear-active-AV-separation
Code and datasets for 'Move2Hear: Active Audio-Visual Source Separation' (ICCV 2021)
☆16 · Jun 17, 2026 · Updated last month
UbiquitousLearning / Benchmark-On-Device-Training
Our unique contributions are in tools/train/benchmark.
☆22 · Apr 14, 2025 · Updated last year
marmoi / dcase2021_task1a_baseline
☆14 · Jun 9, 2021 · Updated 5 years ago
whl97 / LS-Score
☆15 · Nov 24, 2020 · Updated 5 years ago
jacke121 / pytorch-ssim
pytorch structural similarity (SSIM) loss
☆18 · Dec 12, 2018 · Updated 7 years ago
aiff22 / MicroISP
☆18 · Aug 23, 2025 · Updated 11 months ago
Stella-IT / XenGarden
XenGarden, an Object-oriented Python XenAPI Wrapper for managing Citrix Hypervisor and XCP-ng
☆12 · Nov 14, 2022 · Updated 3 years ago
jaeyeonkim99 / visage
Official implementation of "ViSAGe: Video-to-Spatial Audio Generation" (ICLR 2025)
☆47 · Sep 10, 2025 · Updated 10 months ago
mispchallenge / MISP2021-AVSR
repository for paper "Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis"
☆18 · Jun 17, 2022 · Updated 4 years ago
stoneMo / AVGN
Official implementation for AVGN
☆42 · Mar 24, 2023 · Updated 3 years ago
l3das / L3DAS23
Official repository supporting the L3DAS23 IEEE ICASSP Grand Challenge
☆16 · Feb 10, 2023 · Updated 3 years ago