KawhiZhao / Egocentric-Audio-Visual-Speaker-Localization
Code for paper Audio Visual Speaker Localization from EgoCentric Views
☆11Updated last year
Alternatives and similar repositories for Egocentric-Audio-Visual-Speaker-Localization
Users who are interested in Egocentric-Audio-Visual-Speaker-Localization are comparing it to the repositories listed below
- ☆57Updated 2 years ago
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆18Updated 3 years ago
- Code for paper Learning Audio-Visual Dereverberation☆30Updated 3 years ago
- ☆13Updated last year
- Unsupervised domain adaptation for conversational speech enhancement using RemixIT☆54Updated 2 years ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge☆55Updated 6 months ago
- ☆25Updated last year
- ☆39Updated 10 months ago
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".☆141Updated last month
- Accepted by TMM 2022☆17Updated 3 years ago
- This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.☆87Updated 3 years ago
- ☆17Updated 10 months ago
- A Python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult…☆37Updated 11 months ago
- The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]☆123Updated last week
- COG-MHEAR Audio-Visual Speech Enhancement Challenge☆43Updated 4 months ago
- ☆30Updated 2 years ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆57Updated 2 years ago
- A Python implementation of “Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization” [TASLP 2021]☆25Updated 2 years ago
- Learning differentiable temporal resolution on time-series data.☆36Updated 2 years ago
- ICASSP 2022: 'Self-supervised Speaker Recognition with Loss-gated Learning'☆91Updated 2 years ago
- Baseline method for sound event localization task of DCASE 2022 challenge☆56Updated 3 years ago
- Audio-Visual Corruption Modeling of our paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling an…☆35Updated 2 years ago
- ☆126Updated 3 years ago
- SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition☆120Updated last year
- The Easy Communications (EasyCom) dataset is a world-first dataset designed to help mitigate the *cocktail party effect* from an augmente…☆122Updated last year
- This is the public repository for eigenvector-based SALSA features for polyphonic sound event localization and detection.☆104Updated 3 years ago
- VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer☆35Updated 2 years ago
- Neural Network based Sound Source Localization Models☆43Updated 2 years ago
- Noise-Aware Speech Separation with Contrastive Learning☆18Updated last year
- ☆15Updated 3 years ago