yinruiqing/pyannote-whisper

yinruiqing / pyannote-whisper

☆676

Alternatives and similar repositories for pyannote-whisper

Users that are interested in pyannote-whisper are comparing it to the libraries listed below.

Jose-Sabater / whisper-pyannote
Whisper from OpenAI and diarization with Pyannote
☆52 · Jan 7, 2024 · Updated 2 years ago
pyannote / pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker…
☆10,351 · Updated this week
Majdoddin / nlp
☆490 · Sep 10, 2025 · Updated 10 months ago
MahmoudAshraf97 / whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
☆5,614 · Feb 23, 2026 · Updated 5 months ago
lablab-ai / Whisper-transcription_and_diarization-speaker-identification-
How to use OpenAI's Whisper to transcribe and diarize audio files
☆377 · Oct 12, 2022 · Updated 3 years ago
m-bain / whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆23,320 · Jul 13, 2026 · Updated 2 weeks ago
pengzhendong / pyannote-onnx
ONNX Inference of Pyannote Segmentation
☆99 · Dec 23, 2024 · Updated last year
juanmc2005 / diart
A python package to build AI-powered real-time audio applications
☆2,007 · Jun 19, 2026 · Updated last month
hayeong0 / Diff-HierVC
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Pr…
☆237 · Jul 3, 2024 · Updated 2 years ago
EtienneAb3d / WhisperHallu
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
☆351 · Nov 12, 2024 · Updated last year
X-E-Speech / X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
☆112 · Apr 1, 2024 · Updated 2 years ago
huggingface / speechbox
☆358 · Mar 17, 2024 · Updated 2 years ago
yucongzh / online_speaker_diarization
☆15 · Jul 11, 2022 · Updated 4 years ago
choiHkk / Transformer-TTS-V2
☆25 · Mar 6, 2024 · Updated 2 years ago
wq2012 / awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
☆1,888 · Jul 7, 2026 · Updated 3 weeks ago
hcy71o / AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
☆71 · Dec 2, 2022 · Updated 3 years ago
miguelvalente / whisperer
Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.
☆137 · Aug 14, 2023 · Updated 2 years ago
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
☆96 · Oct 18, 2023 · Updated 2 years ago
NavodPeiris / speechlib
Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts…
☆266 · Apr 19, 2026 · Updated 3 months ago
pyf98 / DPHuBERT
INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"
☆118 · Jan 26, 2024 · Updated 2 years ago
p0p4k / pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
☆228 · Dec 24, 2024 · Updated last year
Majdoddin / lexicaps
Transcription and Diarization based on OpenAI's Whisper
☆25 · Sep 9, 2025 · Updated 10 months ago
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
joonson / voxconverse
Spot the conversation: speaker diarisation in the wild
☆171 · Jul 26, 2022 · Updated 4 years ago
SYSTRAN / faster-whisper
Faster Whisper transcription with CTranslate2
☆24,609 · Nov 19, 2025 · Updated 8 months ago
yuan1615 / AdaVocoder
Adaptive Vocoder for Custom Voice
☆61 · Sep 22, 2022 · Updated 3 years ago
Aria-K-Alethia / laughter-synthesis
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77 · Jul 16, 2023 · Updated 3 years ago
BUTSpeechFIT / DiaPer
☆69 · Feb 8, 2024 · Updated 2 years ago
linto-ai / whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
☆2,832 · Sep 9, 2025 · Updated 10 months ago
DongKeon / Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
☆368 · Mar 24, 2026 · Updated 4 months ago
innnky / descript-audio-vae
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
☆92 · Apr 2, 2024 · Updated 2 years ago
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
alexgo84 / whisperx-server
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
☆12 · Mar 10, 2023 · Updated 3 years ago
lifeiteng / naturalspeech3_facodec
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3
☆254 · Apr 20, 2024 · Updated 2 years ago
asuni / PitchSqueezer
A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation
☆38 · Jan 17, 2024 · Updated 2 years ago
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
tonnetonne814 / PL-Bert-VITS2
VITS2 using Phoneme-Level Japanese BERT
☆14 · Dec 17, 2023 · Updated 2 years ago
wenet-e2e / wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
☆1,370 · Jul 8, 2026 · Updated 3 weeks ago