KoljaB/WhoSpeaks

Efficient approach to speaker diarization using voice characteristics extraction

☆109

Alternatives and similar repositories for WhoSpeaks

Users that are interested in WhoSpeaks are comparing it to the libraries listed below.

maximus-choi / Utterr
Real-time speaker diarization using straightforward, intuitive logic - High accuracy thanks to SpeechBrain/Pyannote-WeSpeaker models
☆30 · May 7, 2026 · Updated 2 months ago
KoljaB / Linguflex
Command Your World with Voice
☆811 · Jun 17, 2025 · Updated last year
Yifei-ZHAO96 / Tr-VAD
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆18 · Aug 1, 2024 · Updated last year
KoljaB / LocalEmotionalAIVoiceChat
Simulates talk with an AI that can express emotions
☆88 · Apr 4, 2026 · Updated 3 months ago
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆2,003 · Jun 19, 2026 · Updated last month
icynic / desktop-live-caption
Transcribe desktop audio/computer audio in real-time and locally (Streaming ASR), using TorchAudio and Emformer-RNNT model for inference,…
☆14 · May 7, 2024 · Updated 2 years ago
clement-pages / gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
☆71 · Apr 22, 2026 · Updated 2 months ago
narcotic-sh / senko
Very fast, accurate speaker diarization
☆284 · Jun 11, 2026 · Updated last month
nikhilraghav29 / diarizen-tutorial
DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline.
☆21 · Apr 24, 2026 · Updated 2 months ago
KoljaB / stream2sentence
Real-time processing and delivery of sentences from a continuous stream of characters or text chunks.
☆82 · Updated this week
KoljaB / RealtimeTTS
Converts text to speech in realtime
☆3,995 · May 31, 2026 · Updated last month
camenduru / AdvancedLivePortrait-jupyter
☆11 · Sep 28, 2024 · Updated last year
will-rice / denoisers
Simple PyTorch Denoisers for Waveform Audio
☆41 · Apr 4, 2026 · Updated 3 months ago
thomasmol / cog-whisper-diarization
Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote
☆237 · Jun 11, 2026 · Updated last month
harmlessman / PAFTS
PAFTS: A library that preprocesses audio for TTS.
☆27 · Nov 15, 2024 · Updated last year
yoheinakajima / autofinetune
auto fine tune of models with synthetic data
☆78 · Feb 14, 2024 · Updated 2 years ago
FrenchKrab / datasets-pyannote
Automatically set up the AISHELL-4 and MSDWild datasets for usage with pyannote-database (and pyannote-audio)
☆15 · Oct 22, 2025 · Updated 8 months ago
jczhang02 / MUSIC_dataset_script
This repo contains a script to download the MUSIC dataset from YouTube
☆12 · Jan 19, 2024 · Updated 2 years ago
BriansIDP / WhisperBiasing
☆88 · Jul 31, 2025 · Updated 11 months ago
BGPView / browser-extension
Multi Browser Kango Extension for BGPView - A DNS and BGP network visualizer
☆10 · May 16, 2017 · Updated 9 years ago
TaoRuijie / MFV-KSD
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024)
☆22 · Jul 25, 2024 · Updated last year
SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 · Sep 4, 2023 · Updated 2 years ago
KoljaB / LocalAIVoiceChat
Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with C…
☆726 · Jun 17, 2025 · Updated last year
0xMesto / UnofficialClaude
☆14 · Aug 22, 2024 · Updated last year
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,603 · Feb 23, 2026 · Updated 4 months ago
EnricoCecchini / Narrator-AI
Svelte app to generate audiobooks using XTTS
☆12 · Feb 13, 2024 · Updated 2 years ago
Hunterhuan / sphereface2_speaker_verification
Exploring Binary Classification Loss for Speaker Verification
☆18 · Jul 18, 2023 · Updated 3 years ago
leohuang2013 / pyannote-audio_speaker-diarization_cpp
C++ version of pyannote audio speaker diarization pipeline
☆22 · Feb 14, 2024 · Updated 2 years ago
deepgram-starters / flask-live-transcription
Get started using Deepgram's Live Transcription with this Flask demo app
☆46 · Apr 11, 2026 · Updated 3 months ago
astrologos / tradinggym
A highly-customizable OpenAI gym environment to train & evaluate RL agents trading stocks and crypto.
☆20 · Jun 6, 2023 · Updated 3 years ago
kristianfreeman / region-workers-example
Example Cloudflare Workers project showing how to return HTML responses with enriched region data
☆14 · May 3, 2021 · Updated 5 years ago
pirxus / personalVAD
An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.
☆89 · Sep 22, 2022 · Updated 3 years ago
Jordain / Comfy_Image_Workshop
A scalable solution that simplifies the integration of ComfyUI for developers
☆11 · Jul 15, 2024 · Updated 2 years ago
riteshhere / Speaker_diarization
Speech Diarization for scrum automation
☆111 · Jul 27, 2023 · Updated 2 years ago
dimtzionas / HandObjectInteractionIJCV16_HandMotionViewer
Hand MoCap 3d viewer for the IJCV'16 paper "Capturing Hands in Action using Discriminative Salient Points and Physics Simulation"
☆11 · May 19, 2016 · Updated 10 years ago
Wordcab / wordcab-transcribe
💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.
☆220 · Oct 30, 2024 · Updated last year
xiaoxiaomiao323 / MSA
☆16 · Feb 19, 2026 · Updated 5 months ago
Intersection98 / ComfyUI_MX_post_processing-nodes
☆13 · May 23, 2024 · Updated 2 years ago
Barbariskaa / Biba
☆19 · Jul 1, 2023 · Updated 3 years ago