Ming-er / LGC-SED
☆12 · Updated last year
Alternatives and similar repositories for LGC-SED:
Users who are interested in LGC-SED are comparing it to the repositories listed below:
- official implementation of MGA-CLAP (ACM MM 2024) ☆12 · Updated 4 months ago
- ☆17 · Updated 3 months ago
- Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection ☆16 · Updated 6 months ago
- This repository collects papers related to Speech Tokenizer. ☆15 · Updated 4 months ago
- MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection ☆8 · Updated 7 months ago
- [INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In… ☆44 · Updated 11 months ago
- Baseline method for audio-visual sound event localization and detection task of DCASE 2023 challenge ☆50 · Updated last year
- The dataset and baseline code for Text-to-Audio Grounding (TAG) ☆42 · Updated last month
- ☆11 · Updated last year
- Implementation of our paper 'On Metric Learning For Audio-Text Cross-Modal Retrieval' ☆43 · Updated 2 years ago
- ☆23 · Updated 4 months ago
- 🦇 Encoder of BAT (Learning to Reason about Spatial Sounds with Large Language Models) ☆41 · Updated 2 weeks ago
- Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition" ☆19 · Updated last year
- Download audioset data super fastly with youtube-dl, ffmpeg and python multiprocessing ☆35 · Updated 7 months ago
- Source code for the paper 'Audio Captioning Transformer' ☆53 · Updated 3 years ago
- baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift ☆17 · Updated 11 months ago
- This package aims at simplifying the download of the AudioCaps dataset. ☆31 · Updated last year
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection ☆12 · Updated 3 months ago
- ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation ☆23 · Updated 11 months ago
- ☆33 · Updated last week
- Sound Event Detection (SED) paper collection ☆13 · Updated 8 months ago
- The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning" ☆19 · Updated last year
- ☆22 · Updated 11 months ago
- ☆25 · Updated last year
- The official repo for Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation ☆29 · Updated last week
- Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022) ☆17 · Updated 2 years ago
- Code for CVSSP submission to DCASE 2021 Task 6 ☆35 · Updated 2 years ago
- A dataset for Audio-Visual Sound Event Detection in Movies ☆27 · Updated 2 years ago