Labbeti / conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
☆14 · Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for conette-audio-captioning
- For students who would like to apply for RA, PhD, postdoc in audio research. ☆24 · Updated last month
- Boosting Self-Supervised Embeddings for Speech Enhancement ☆45 · Updated 2 years ago
- Learning differentiable temporal resolution on time-series data. ☆33 · Updated 2 years ago
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations" ☆31 · Updated 5 months ago
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting" ☆16 · Updated 3 months ago
- An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification ☆15 · Updated 2 months ago
- Generation scripts for EARS-WHAM and EARS-Reverb ☆23 · Updated 2 months ago
- ICASSP 2025: Dynamic Embedding Causal Target Speech Extraction ☆29 · Updated last month
- A repo containing download guidance and corresponding scripts of the VoxBlink dataset. ☆23 · Updated 7 months ago
- Exploring Binary Classification Loss for Speaker Verification ☆14 · Updated last year
- A Diffusion Probabilistic Model for Target Sound Extraction ☆35 · Updated last month
- A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline ☆67 · Updated 2 weeks ago
- This repository contains the code of the CP JKU submission to DCASE23 Task 1 "Low-complexity Acoustic Scene Classification" ☆23 · Updated last year
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆32 · Updated last year
- A toolkit dedicated to speech evaluation. ☆18 · Updated last month
- Data simulation scripts for paper "Target Sound Extraction with Variable Cross-modality Clues" ☆14 · Updated last year
- A Python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult… ☆29 · Updated last month
- The source code of Tim-TSENet ☆12 · Updated 2 years ago
- COG-MHEAR Audio-Visual Speech Enhancement Challenge ☆34 · Updated 8 months ago
- DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning ☆47 · Updated 10 months ago
- Code and data recipes for the paper: Heterogeneous Target Speech Separation ☆39 · Updated last year
- Code for CVSSP submission to DCASE 2021 Task 6 ☆35 · Updated 2 years ago
- NOMAD: Non-Matching Audio Distance (ICASSP 2024) ☆24 · Updated last month
- ARCH: Audio Representations benCHmark ☆38 · Updated 2 months ago