Labbeti / conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
☆11 · Updated last month
Related projects:
- Official data preparation scripts for the URGENT 2024 Challenge ☆54 · Updated last month
- Boosting Self-Supervised Embeddings for Speech Enhancement ☆42 · Updated 2 years ago
- COG-MHEAR Audio-Visual Speech Enhancement Challenge ☆32 · Updated 5 months ago
- Data simulation scripts for the paper "Target Sound Extraction with Variable Cross-modality Clues" ☆13 · Updated last year
- NOMAD is a fully unsupervised non-matching reference audio quality metric ☆23 · Updated 3 months ago
- Training data simulation ☆38 · Updated 4 months ago
- Exploring Binary Classification Loss for Speaker Verification ☆14 · Updated last year
- Query-conditioned target sound extraction model ☆14 · Updated 3 months ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE) ☆32 · Updated 11 months ago
- A repo containing download guidance and corresponding scripts for the VoxBlink dataset. ☆20 · Updated 5 months ago
- PyTorch implementation of Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Pro… ☆18 · Updated 9 months ago
- ICASSP 2023: 'Speaker recognition with two-step multi-modal deep cleansing' ☆30 · Updated last year
- Official repository of a new SOTA diffusion-model-based method for speech enhancement ☆28 · Updated last month
- Official implementation for our paper "Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations" ☆22 · Updated 3 months ago
- For students who would like to apply for RA, PhD, or postdoc positions in audio research. ☆22 · Updated 11 months ago
- A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding" ☆42 · Updated this week
- Learning differentiable temporal resolution on time-series data. ☆33 · Updated last year
- Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint ☆54 · Updated last year
- A PyTorch implementation of MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network ☆60 · Updated 2 years ago
- Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models. ☆66 · Updated 3 weeks ago
- Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings ☆26 · Updated last year
- PyTorch implementation of subband decomposition ☆88 · Updated 2 years ago
- This repository includes the code to reproduce our paper "RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Spea… ☆49 · Updated 11 months ago
- A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization ☆62 · Updated 2 weeks ago