Speaker change detection using SincNet and an LSTM/Transformer
☆57 · May 26, 2025 · Updated last year
Alternatives and similar repositories for speaker-change-detection
Users that are interested in speaker-change-detection are comparing it to the libraries listed below.
- Online streaming speaker change detection model in PyTorch ☆43 · Apr 14, 2023 · Updated 3 years ago
- ☆15 · Jul 11, 2022 · Updated 3 years ago
- Automatically set up the AISHELL-4 and MSDWild datasets for use with pyannote-database (and pyannote-audio) ☆15 · Oct 22, 2025 · Updated 7 months ago
- Both audio-only and audio-visual speaker diarization datasets are listed here. ☆15 · Feb 22, 2023 · Updated 3 years ago
- Paper: https://arxiv.org/abs/1702.02285 ☆64 · Dec 19, 2018 · Updated 7 years ago
- ☆11 · May 4, 2020 · Updated 6 years ago
- Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment ☆14 · Feb 5, 2025 · Updated last year
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias ☆20 · Jul 21, 2020 · Updated 5 years ago
- ☆325 · Jun 14, 2024 · Updated last year
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 11 months ago
- The VoxTube dataset official repository ☆71 · Feb 14, 2024 · Updated 2 years ago
- ☆36 · Jan 6, 2026 · Updated 4 months ago
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆14 · Mar 11, 2025 · Updated last year
- Predicts the level of noise and reverberation in your audio files ☆186 · Updated this week
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆125 · Apr 8, 2022 · Updated 4 years ago
- The official PyTorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆179 · May 7, 2026 · Updated 3 weeks ago
- ☆12 · Jun 14, 2024 · Updated last year
- Onset-and-Offset-Aware Sound Event Detection ☆21 · Feb 10, 2025 · Updated last year
- ☆12 · Jun 14, 2022 · Updated 3 years ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆65 · Nov 5, 2025 · Updated 6 months ago
- A TensorFlow implementation of GhostVLAD for speaker recognition ☆15 · May 2, 2019 · Updated 7 years ago
- This is the code and dataset repo for the Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic … ☆58 · Aug 15, 2025 · Updated 9 months ago
- Thai Grapheme to Phoneme (G2P) Wiktionary Corpus ☆13 · Jul 25, 2022 · Updated 3 years ago
- A lightweight audio codec based on a single quantizer ☆34 · Sep 4, 2025 · Updated 8 months ago
- Clustering-based methods for overlapping diarization ☆83 · Jan 12, 2024 · Updated 2 years ago
- Official implementation of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch ☆41 · Aug 31, 2023 · Updated 2 years ago
- Code for Towards Data Science article on prompt-loss-weight ☆11 · Jun 4, 2025 · Updated 11 months ago
- A lightweight library to compute Diarization Error Rate (DER). ☆62 · Jan 14, 2026 · Updated 4 months ago
- ☆27 · Sep 10, 2025 · Updated 8 months ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆23 · May 19, 2026 · Updated last week
- LightVoc: An Upsampling-Free GAN Vocoder Based on Conformer and Inverse Short-Time Fourier Transform ☆18 · May 17, 2024 · Updated 2 years ago
- This repository contains the source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Mar 7, 2023 · Updated 3 years ago
- ☆35 · Oct 23, 2025 · Updated 7 months ago
- An ODE-based generative neural vocoder using Rectified Flow ☆58 · Apr 29, 2023 · Updated 3 years ago
- Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513 ☆64 · Feb 13, 2023 · Updated 3 years ago
- ☆15 · Apr 16, 2026 · Updated last month
- This repository contains prompts and best practices for annotating audio clips in a very high degree of detail using Audio-Language-Models ☆35 · Oct 13, 2024 · Updated last year
- Forced alignment decoder for Whisper. ☆16 · Mar 13, 2024 · Updated 2 years ago
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code ☆157 · May 2, 2024 · Updated 2 years ago