liyidi / soundnet_localize_sound_source
soundnet and localize sound source
☆12Updated 3 years ago
Related projects
Alternatives and complementary repositories for soundnet_localize_sound_source
- Codebase and Dataset for the paper: Learning to Localize Sound Source in Visual Scenes☆84Updated 3 years ago
- Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization, ACM MM 2020☆32Updated 4 years ago
- cross modal background suppression for audio-visual event localization☆35Updated 2 years ago
- ☆13Updated 4 months ago
- This repo aims to perform sound localization in complex audiovisual scenes, where there are multiple objects making sounds.☆81Updated 3 years ago
- ☆39Updated 2 years ago
- Accepted by TMM 2022☆16Updated 2 years ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge☆40Updated last year
- Deformable Speech Transformer (DST)☆27Updated 3 months ago
- ABAW3 (CVPRW): A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition☆35Updated 10 months ago
- SpeechFormer++ in PyTorch☆42Updated last year
- ☆32Updated last week
- Official repository supporting the L3DAS23 IEEE ICASSP Grand Challenge☆16Updated last year
- This repository contains the code for our ICASSP paper `Speech Emotion Recognition using Semantic Information` https://arxiv.org/pdf/2103…☆23Updated 3 years ago
- Baseline method for sound event localization task of DCASE 2022 challenge☆52Updated 2 years ago
- ☆14Updated last week
- Data preparation for separation☆72Updated 3 years ago
- This repo includes the official implementations of "Fine-tune the pretrained ATST model for sound event detection".☆101Updated last month
- ☆40Updated 4 years ago
- ☆30Updated last week
- Code for Speech Emotion Recognition with Co-Attention based Multi-level Acoustic Information☆129Updated 11 months ago
- Code for "Simple Pooling Front-ends for Efficient Audio Classification", ICASSP 2023☆56Updated last year
- ☆28Updated 5 months ago
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.☆17Updated 2 years ago
- An LSTM for voice activity detection. In fact, this is a homework assignment that I didn't expect.☆13Updated 3 years ago
- Deep-Learning-Based Audio-Visual Speech Enhancement and Separation☆203Updated last year
- Multi-modal Speech Emotion Recognition on IEMOCAP dataset☆85Updated last year
- 3-D Convolutional Recurrent Neural Networks With Attention Model for Speech Emotion Recognition.☆35Updated 4 years ago
- Unofficial implementation of Dual-Path Transformer Network (DPTNet) for speech separation (Interspeech 2020)☆43Updated 3 years ago
- VGGSound: A Large-scale Audio-Visual Dataset☆291Updated 3 years ago