Saurabhbhati / DASS
☆11 · Updated 3 weeks ago
Alternatives and similar repositories for DASS
Users who are interested in DASS are comparing it to the repositories listed below
- MTDA-HSED: Mutual-Assistance Tuning and Dual-Branch Aggregating for Heterogeneous Sound Event Detection ☆9 · Updated 7 months ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ☆18 · Updated 10 months ago
- ☆13 · Updated 4 months ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆12 · Updated 3 months ago
- Repository of the WACV'24 paper "Can CLIP Help Sound Source Localization?" ☆28 · Updated 2 months ago
- Event Relation in Text-to-Audio (TTA) Generation ☆17 · Updated 2 months ago
- ☆10 · Updated 7 months ago
- Query-conditioned target sound extraction model ☆23 · Updated last month
- Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023 ☆12 · Updated last year
- Prediction of sound event bounding boxes (SEBBs) ☆27 · Updated 9 months ago
- ☆24 · Updated 7 months ago
- ☆13 · Updated 2 years ago
- A benchmark for evaluating audio encoders on various audio tasks. ☆19 · Updated this week
- Tools for the evaluation of audio captioning. ☆16 · Updated 4 years ago
- Code and data recipes for the paper: Heterogeneous Target Speech Separation ☆42 · Updated 2 years ago
- ☆23 · Updated 7 months ago
- ☆10 · Updated 5 months ago
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec ☆33 · Updated this week
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆45 · Updated 7 months ago
- Whisper Speech Quality Assessment (WhiSQA) ☆9 · Updated 5 months ago
- Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation" ☆25 · Updated last year
- Understanding Audio Features via Trainable Basis Functions ☆9 · Updated 3 years ago
- [SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition ☆12 · Updated 5 months ago
- Official PyTorch implementation of 'VINP: Variational Bayesian Inference with Neural Speech Prior for Joint ASR-Effective Speech Dereverb… ☆17 · Updated last month
- Official code of ElasticAST (Interspeech 2024 paper) ☆30 · Updated 9 months ago
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages ☆15 · Updated 11 months ago
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection ☆14 · Updated 5 months ago
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆35 · Updated last week
- ☆40 · Updated 2 years ago
- ☆16 · Updated last year