Baseline for DCASE 2024 Task 9: "Language-Queried Audio Source Separation"
☆26 · Mar 27, 2024 · Updated 2 years ago
Alternatives and similar repositories for dcase2024_task9_baseline
Users that are interested in dcase2024_task9_baseline are comparing it to the libraries listed below.
- Baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift ☆18 · Mar 16, 2024 · Updated 2 years ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆42 · Oct 13, 2023 · Updated 2 years ago
- Official implementation for FlowSep ☆75 · Jan 2, 2025 · Updated last year
- Official data preparation and metric evaluation scripts for the Interspeech 2025 URGENT challenge. ☆84 · May 21, 2025 · Updated 11 months ago
- ☆12 · Nov 7, 2024 · Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆118 · Jan 28, 2026 · Updated 3 months ago
- A Diffusion Probabilistic Model for Target Sound Extraction ☆40 · Sep 27, 2024 · Updated last year
- Discogs-VI dataset and code ☆21 · Dec 13, 2024 · Updated last year
- WildDESED: A LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection ☆18 · Nov 19, 2024 · Updated last year
- Single channel speech source separation by diffusion process (ICASSP 2023) ☆126 · Mar 15, 2024 · Updated 2 years ago
- Prediction of sound event bounding boxes (SEBBs) ☆34 · Aug 2, 2024 · Updated last year
- Implementation for "Music Enhancement via Image Translation and Vocoding" ☆54 · Apr 28, 2022 · Updated 4 years ago
- [ICLR 2025] Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes ☆71 · Oct 8, 2025 · Updated 6 months ago
- ☆30 · Apr 22, 2024 · Updated 2 years ago
- ☆23 · Mar 19, 2025 · Updated last year
- ☆12 · Mar 11, 2025 · Updated last year
- ☆216 · Dec 5, 2024 · Updated last year
- ☆26 · Mar 20, 2024 · Updated 2 years ago
- ☆68 · Aug 16, 2023 · Updated 2 years ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline ☆201 · Dec 13, 2024 · Updated last year
- Official Repository for paper "Ambisonizer: Neural Upmixing as Spherical Harmonics Generation" ☆16 · May 27, 2024 · Updated last year
- ☆28 · Mar 28, 2024 · Updated 2 years ago
- The source code of Tim-TSENet ☆15 · Apr 22, 2022 · Updated 4 years ago
- Boosting Self-Supervised Embeddings for Speech Enhancement ☆47 · Jun 23, 2022 · Updated 3 years ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆215 · Sep 19, 2024 · Updated last year
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models ☆22 · Jul 10, 2024 · Updated last year
- An official implementation of the ICASSP 2024 paper: Dual-Path TFC-TDF UNet for Music Source Separation ☆105 · Mar 19, 2024 · Updated 2 years ago
- [EMNLP 2024] ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers ☆125 · Mar 20, 2025 · Updated last year
- ☆87 · May 21, 2023 · Updated 2 years ago
- ☆33 · Dec 23, 2025 · Updated 4 months ago
- Solos: A Dataset for Audio-Visual Music Analysis ☆24 · Feb 17, 2023 · Updated 3 years ago
- ☆14 · Jan 2, 2025 · Updated last year
- Query-conditioned target sound extraction model ☆30 · Mar 25, 2025 · Updated last year
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ☆248 · Mar 7, 2025 · Updated last year
- Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues" ☆17 · May 19, 2023 · Updated 2 years ago
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆101 · Jul 24, 2024 · Updated last year
- Official data preparation scripts for the URGENT 2024 Challenge ☆88 · May 21, 2025 · Updated 11 months ago
- Fluency ENhanced Sentence-bert Evaluation (FENSE), metric for audio caption evaluation. And Benchmark dataset AudioCaps-Eval, Clotho-Eval… ☆21 · Feb 1, 2023 · Updated 3 years ago
- Readability-aware automatic lyrics transcription (ALT) evaluation toolkit ☆44 · Aug 29, 2024 · Updated last year