Jose-Sabater/whisper-pyannote

Jose-Sabater / whisper-pyannote

Whisper from OpenAI and diarization with Pyannote

☆52

Alternatives and similar repositories for whisper-pyannote

Users that are interested in whisper-pyannote are comparing it to the libraries listed below.

Jose-Sabater / AI-Assistant-Whisper-ChatGPT-Notion
Automated notes. From video to summary in Notion. Using moviepy, whisper, chatgpt and notionapi
☆19 · Apr 2, 2024 · Updated 2 years ago
mirix / approaches-to-diarisation
A testing repo to share code and thoughts on diarisation
☆58 · Mar 26, 2024 · Updated 2 years ago
NavodPeiris / speechlib
Speechlib is a library that unifies speaker diarization, transcription and speaker recognition in a single pipeline to create transcripts…
☆266 · Apr 19, 2026 · Updated 3 months ago
ndkgit339 / spe-dss
Speech Parameter Estimation Using Differentiable Speech Synthesizer
☆43 · May 9, 2023 · Updated 3 years ago
daanzu / py-silero-vad-lite
Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies
☆17 · Nov 25, 2024 · Updated last year
reppy4620 / x-vits
☆14 · Aug 1, 2025 · Updated 11 months ago
lars76 / fastspeech2-clean
Clean and modernized implementation of FastSpeech2/LightSpeech using IPA
☆18 · Aug 16, 2024 · Updated last year
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 · Aug 18, 2023 · Updated 2 years ago
5Hyeons / StyleTTS2-Vocos
StyleTTS2 + Vocos as a Decoder
☆13 · Mar 24, 2025 · Updated last year
mct10 / CoBERT
Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
☆48 · Nov 8, 2023 · Updated 2 years ago
DDATT / Vits2-onnx-cpp
Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++
☆19 · Apr 17, 2024 · Updated 2 years ago
v-nhandt21 / MusicVoiceConversion
Sing any popular song with your voice
☆11 · Jul 10, 2022 · Updated 4 years ago
MaxMax2016 / max-vc
singing voice conversion without f0
☆23 · May 10, 2023 · Updated 3 years ago
ttslr / MonTTS
☆16 · Dec 23, 2021 · Updated 4 years ago
philgzl / brever
Speech enhancement in noisy and reverberant environments using deep neural networks
☆23 · Oct 10, 2025 · Updated 9 months ago
JaesungHuh / VoxSRC2022
VoxSRC2022 workshop development kit
☆19 · Jul 21, 2022 · Updated 4 years ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26 · Aug 5, 2024 · Updated last year
dafyddg / RFA
Implementation of the Rhythm Formant Analysis methodology for identifying speech rhythms and rhythm variation in the low frequency spectr…
☆17 · Apr 27, 2023 · Updated 3 years ago
tonnetonne814 / SiFi-VITS2-44100-Ja
DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.
☆55 · Sep 25, 2023 · Updated 2 years ago
manthan98 / Cochlear-Implant-Processor
Cochlear implant signal processing
☆10 · Jun 24, 2021 · Updated 5 years ago
ZhaoF-i / SDAEC
☆19 · Jan 6, 2025 · Updated last year
p0p4k / Matcha-TTS-2
E2E TTS using Conditional Flow Matching (Experimental*)
☆71 · Nov 10, 2023 · Updated 2 years ago
idiap / zff_vad
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
☆23 · Oct 19, 2023 · Updated 2 years ago
DakeQQ / STFT-ISTFT-ONNX
Export the STFT or ISTFT process in ONNX format.
☆47 · Jun 6, 2026 · Updated last month
lsq960124 / StyleBERT
Implementation of the paper: StyleBERT: Text-Audio Sentiment Analysis with Bi-directional Style Enhancement
☆14 · Apr 10, 2023 · Updated 3 years ago
flageval-baai / ChildMandarin
[ACL 2025 Main] A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5
☆59 · Mar 19, 2025 · Updated last year
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
dy / image-output
Output image to a file, stream, canvas, console, buffer or any other destination
☆17 · Jan 18, 2025 · Updated last year
Blucknote / Kandinsky-advanced-notebooks
Notebooks with additional features to run Kandinsky
☆14 · May 15, 2023 · Updated 3 years ago
36Kr-Mobile / KRKit
An open source iOS framework
☆14 · Jun 17, 2015 · Updated 11 years ago
dwsjoan / SRAS
Speech Recognition and Simple AI Summary: supports local speech-to-text, speaker diarization, and simple AI summarization, paired with a web-based interface.
☆11 · Jul 22, 2024 · Updated 2 years ago
hcy71o / AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
☆71 · Dec 2, 2022 · Updated 3 years ago
gurteshwar / freeswitch-esl-python
Auto generated swig python module with a binary component
☆11 · Apr 19, 2012 · Updated 14 years ago
sarulab-speech / multi-speaker-dgp
Official implementation of DGP-based multi-speaker speech synthesis with PyTorch
☆24 · Mar 23, 2021 · Updated 5 years ago
PanagiotisP / svs-multiband
Code for the paper "MULTI-BAND MASKING FOR WAVEFORM-BASED SINGING VOICE SEPARATION", accepted at EUSIPCO2022
☆15 · Jun 18, 2022 · Updated 4 years ago
vliu15 / adversarial-tts
End-to-end Text-to-Speech with Generative Adversarial Networks
☆20 · Feb 6, 2021 · Updated 5 years ago
Mastering-Python-GT / Transcription-diarization-whisper-pyannote
Transcription and diarization (speaker identification)
☆33 · May 31, 2023 · Updated 3 years ago
b-sigpro / sed-hsmm
Onset-and-Offset-Aware Sound Event Detection
☆21 · Feb 10, 2025 · Updated last year
linto-ai / linto-diarization
Speaker diarization service
☆27 · Jul 2, 2026 · Updated 3 weeks ago