A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIVITY DETECTION IN ADVERSE CONDITIONS"
☆21Nov 25, 2024Updated last year
Alternatives and similar repositories for SSL-PVAD
Users that are interested in SSL-PVAD are comparing it to the libraries listed below
Sorting:
- ☆11Nov 7, 2024Updated last year
- Accompanying repository for the paper "Automatic Music Mixing Using a Generative Model of Effect Embeddings"☆22Jan 18, 2026Updated last month
- Convert a mono channel recording into binaural playback with headphones and loudspeakers☆13Dec 6, 2023Updated 2 years ago
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Nov 14, 2023Updated 2 years ago
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…☆23Aug 14, 2025Updated 6 months ago
- ☆23Feb 2, 2022Updated 4 years ago
- Implementation of Sheffield entry for Clarity enhancement challenge.☆18Apr 19, 2022Updated 3 years ago
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model☆17Aug 1, 2024Updated last year
- Audio samples for the paper 'Phase-aware music super-resolution using generative adversarial networks'☆14May 15, 2020Updated 5 years ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)☆22Jul 25, 2024Updated last year
- This is the official implementation for εar-VAE model including inference and evaluation parts, more details coming soon...☆56Feb 13, 2026Updated 3 weeks ago
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control☆31Jan 13, 2026Updated last month
- Reimplementation of Miipher☆29Aug 16, 2023Updated 2 years ago
- Lightweight Speech Representation Learning for One-Shot Voice Conversion☆24Dec 12, 2024Updated last year
- ☆25Feb 28, 2023Updated 3 years ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"☆38Oct 28, 2025Updated 4 months ago
- A pytorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"☆60Sep 19, 2024Updated last year
- Ablation study of local spectral attention (LSA) for full-band speech enhancement (SE)☆28Sep 16, 2023Updated 2 years ago
- ☆32Jan 9, 2024Updated 2 years ago
- Comprehensive benchmark suite comparing pitch detection algorithms across multiple datasets.