Speaker change detection using SincNet and an LSTM/Transformer
☆58 May 26, 2025 Updated 10 months ago
Alternatives and similar repositories for speaker-change-detection
Users that are interested in speaker-change-detection are comparing it to the libraries listed below.
- Online streaming speaker change detection model in PyTorch ☆43 Apr 14, 2023 Updated 2 years ago
- ☆15 Jul 11, 2022 Updated 3 years ago
- Automatically set up the AISHELL-4 and MSDWild datasets for use with pyannote-database (and pyannote-audio) ☆15 Oct 22, 2025 Updated 5 months ago
- Both audio-only and audio-visual speaker diarization datasets are listed here. ☆15 Feb 22, 2023 Updated 3 years ago
- Paper: https://arxiv.org/abs/1702.02285 ☆65 Dec 19, 2018 Updated 7 years ago
- ☆11 May 4, 2020 Updated 5 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆14 Feb 5, 2025 Updated last year
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias ☆20 Jul 21, 2020 Updated 5 years ago
- ☆324 Jun 14, 2024 Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 Jun 27, 2025 Updated 9 months ago
- The VoxTube dataset official repository ☆71 Feb 14, 2024 Updated 2 years ago
- ☆36 Jan 6, 2026 Updated 2 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆13 Mar 11, 2025 Updated last year
- Predicts the level of noise and reverberation in your audio files ☆184 Jun 17, 2025 Updated 9 months ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆125 Apr 8, 2022 Updated 3 years ago
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆172 Dec 12, 2025 Updated 3 months ago
- ☆11 Jun 14, 2024 Updated last year
- Onset-and-Offset-Aware Sound Event Detection ☆21 Feb 10, 2025 Updated last year
- ☆12 Jun 14, 2022 Updated 3 years ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆64 Nov 5, 2025 Updated 4 months ago
- A TensorFlow implementation of GhostVLAD for speaker recognition ☆15 May 2, 2019 Updated 6 years ago
- This is the code and dataset repo for the Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic… ☆56 Aug 15, 2025 Updated 7 months ago
- Thai Grapheme to Phoneme (G2P) Wiktionary Corpus ☆13 Jul 25, 2022 Updated 3 years ago
- A lightweight audio codec based on a single quantizer ☆34 Sep 4, 2025 Updated 6 months ago
- Clustering-based methods for overlapping diarization ☆82 Jan 12, 2024 Updated 2 years ago
- Official implementation of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch ☆41 Aug 31, 2023 Updated 2 years ago
- Code for the Towards Data Science article on prompt-loss-weight ☆11 Jun 4, 2025 Updated 9 months ago
- A lightweight library to compute Diarization Error Rate (DER). ☆62 Jan 14, 2026 Updated 2 months ago
- ☆25 Sep 10, 2025 Updated 6 months ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆23 Mar 17, 2026 Updated last week
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform ☆18 May 17, 2024 Updated last year
- This repository contains source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 Mar 7, 2023 Updated 3 years ago
- ☆32 Oct 23, 2025 Updated 5 months ago
- An ODE-based generative neural vocoder using Rectified Flow ☆58 Apr 29, 2023 Updated 2 years ago
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513 ☆64 Feb 13, 2023 Updated 3 years ago
- ☆15 Aug 22, 2025 Updated 7 months ago
- This repository contains prompts and best practices for annotating audio clips in very fine detail using Audio-Language-Models ☆35 Oct 13, 2024 Updated last year
- Forced alignment decoder for Whisper. ☆15 Mar 13, 2024 Updated 2 years ago
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code ☆155 May 2, 2024 Updated last year