HHousen/speaker-change-detection

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/HHousen/speaker-change-detection)

HHousen / speaker-change-detection

Speaker change detection using SincNet and an LSTM/Transformer

☆57

Alternatives and similar repositories for speaker-change-detection

Users that are interested in speaker-change-detection are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

alumae / online_speaker_change_detector
View on GitHub
Online streaming speaker change detection model in Pytorch
☆44Apr 14, 2023Updated 3 years ago
yucongzh / online_speaker_diarization
View on GitHub
☆15Jul 11, 2022Updated 4 years ago
liutaocode / AwesomeDiarizationDataset
View on GitHub
Both audio-only and audio-visual speaker diarization datasets are listed here.
☆16Feb 22, 2023Updated 3 years ago
philipperemy / speaker-change-detection
View on GitHub
Paper: https://arxiv.org/abs/1702.02285
☆64Dec 19, 2018Updated 7 years ago
FrenchKrab / datasets-pyannote
View on GitHub
Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)
☆15Oct 22, 2025Updated 9 months ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
artie-inc / artie-bias-corpus
View on GitHub
Artie Bias Corpus: an audio corpus + code for detecting demographic bias
☆20Jul 21, 2020Updated 6 years ago
Livefull / SphereDiar
View on GitHub
☆11May 4, 2020Updated 6 years ago
huggingface / diarizers
View on GitHub
☆327Jun 14, 2024Updated 2 years ago
fgnt / speaker_reassignment
View on GitHub
Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment
☆14Feb 5, 2025Updated last year
IDRnD / VoxTube
View on GitHub
The VoxTube dataset official repository
☆71Feb 14, 2024Updated 2 years ago
llm-lab-org / CLASP
View on GitHub
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13Jun 27, 2025Updated last year
egruttadauria98 / SSpaVAlDo
View on GitHub
☆37Jan 6, 2026Updated 6 months ago
k2-fsa / sherpa-mlx
View on GitHub
sherpa with mlx
☆15Aug 2, 2025Updated 11 months ago
samsad35 / code-ancogen
View on GitHub
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14Mar 11, 2025Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
marianne-m / brouhaha-vad
View on GitHub
Predicts the level of noise and reverberation on your audiofiles
☆191May 23, 2026Updated 2 months ago
SSTC-Challenge / SSTC2024_baseline_system
View on GitHub
☆12Jun 14, 2024Updated 2 years ago
tango4j / Auto-Tuning-Spectral-Clustering
View on GitHub
This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"
☆125Apr 8, 2022Updated 4 years ago
qinxiaoyi / Simple-Attention-Module-based-Speaker-Verification-with-Iterative-Noisy-Label-Detection
View on GitHub
☆12Jun 14, 2022Updated 4 years ago
Audio-WestlakeU / FS-EEND
View on GitHub
The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based …
☆183May 7, 2026Updated 2 months ago
PyThaiNLP / thai-g2p-wiktionary-corpus
View on GitHub
Thai Grapheme to Phoneme (G2P) Wiktionary Corpus
☆13Jul 25, 2022Updated 4 years ago
chentuochao / Target-Conversation-Extraction
View on GitHub
This is the code and dataset repo for Interspeech 2024 paper "Target conversation extraction: Source separation using turn-taking dynamic…
☆58Aug 15, 2025Updated 11 months ago
lonzi / mrflow_dpo
View on GitHub
☆22Jan 3, 2026Updated 6 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
desh2608 / diarizer
View on GitHub
Clustering-based methods for overlapping diarization
☆84Jan 12, 2024Updated 2 years ago
koudounasalkis / voc2vec
View on GitHub
This repository contains the code for the paper "voc2vec: A Foundation Model for Non-Verbal Vocalization", accepted at ICASSP 2025.
☆58Apr 14, 2025Updated last year
taylorlu / ghostvlad-speaker
View on GitHub
An tensorflow implementation of ghostvlad for speaker recognition
☆15May 2, 2019Updated 7 years ago
YChenL / DS-TDNN
View on GitHub
Official implement of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch
☆41Aug 31, 2023Updated 2 years ago
lavendery / AudioComposer
View on GitHub
☆27Sep 10, 2025Updated 10 months ago
wq2012 / SimpleDER
View on GitHub
A lightweight library to compute Diarization Error Rate (DER).
☆62Jan 14, 2026Updated 6 months ago
DCGM / SoftCTC
View on GitHub
This repository contains source codes for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135
☆19Mar 7, 2023Updated 3 years ago
revsic / torch-nansy
View on GitHub
Torch implementation of NANSY, Neural Analysis and Synthesis, arXiv:2110.14513
☆64Feb 13, 2023Updated 3 years ago
cvqluu / simple_diarizer
View on GitHub
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
☆158May 2, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
huutuongtu / Lightvoc
View on GitHub
LIGHTVOC AN UPSAMPLING-FREE GAN VOCODER BASED ON CONFORMER AND INVERSE SHORT-TIME FOURIER TRANSFORM
☆18May 17, 2024Updated 2 years ago
WelkinYang / WaveODE
View on GitHub
An ODE-based generative neural vocoder using Rectified Flow
☆58Apr 29, 2023Updated 3 years ago
davidsvaughn / prompt-loss-weight
View on GitHub
code for Towards Data Science article on prompt-loss-weight
☆11Jun 4, 2025Updated last year
aqtq314 / VogenSVS
View on GitHub
☆15Apr 16, 2026Updated 3 months ago
LAION-AI / emotional-speech-annotations
View on GitHub
This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models
☆35Oct 13, 2024Updated last year
alimehotline / FTAD
View on GitHub
☆17Apr 12, 2021Updated 5 years ago
k2-fsa / kaldifst
View on GitHub
Python wrapper for OpenFST and its extensions from Kaldi. Also support reading/writing ark/scp files
☆56Apr 9, 2026Updated 3 months ago