nianlonggu / WhisperSeg
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
☆29 Updated 3 months ago
Alternatives and similar repositories for WhisperSeg:
Users who are interested in WhisperSeg are comparing it to the libraries listed below:
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆82 Updated last year
- ☆29 Updated 8 months ago
- Open implementation of UNIVERSE and UNIVERSE++ diffusion-based speech enhancement models. ☆91 Updated 6 months ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆47 Updated 8 months ago
- NOTSOFAR-1 Challenge: Distant Diarization and ASR ☆50 Updated last month
- This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic… ☆46 Updated 5 months ago
- A Python implementation of the Speech Intelligibility Index ☆41 Updated last year
- Analysis of XLS-R for Speech Quality Assessment ☆13 Updated last month
- ☆61 Updated last year
- Clustering-based methods for overlapping diarization ☆77 Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆46 Updated 6 months ago
- A library built for easier audio self-supervised training, downstream tasks evaluation ☆112 Updated 6 months ago
- ☆45 Updated last month
- Repo for source code of EBEN: Extreme Bandwidth Extension Network ☆72 Updated last month
- Machine learning speaker characteristics ☆33 Updated 2 weeks ago
- A Python library for voice activity detection (VAD) for speech/non-speech segmentation. ☆86 Updated 2 years ago
- High-Fidelity Neural Phonetic Posteriorgrams ☆105 Updated 3 weeks ago
- ☆31 Updated 11 months ago
- ☆57 Updated 4 years ago
- Learning differentiable temporal resolution on time-series data. ☆36 Updated 2 years ago
- ☆52 Updated 9 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆82 Updated 2 months ago
- Confidence interval computation for evaluation in machine learning using the bootstrapping approach ☆77 Updated 11 months ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 Updated 2 years ago
- ☆64 Updated last year
- Streaming Audiotransformers for online Audio tagging ☆43 Updated 9 months ago
- ☆52 Updated last year
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆120 Updated last week
- A self-supervised speech denoising strategy named Only-Noisy Training (ONT), which solves the speech denoising problem with only noisy au… ☆65 Updated 2 years ago
- ☆28 Updated 3 years ago