leohuang2013/pyannote-audio_overlapped-speech-detection_cpp

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/leohuang2013/pyannote-audio_overlapped-speech-detection_cpp)

leohuang2013 / pyannote-audio_overlapped-speech-detection_cpp

C++ version of pyannote audio overlapped speech detection pipeline

☆13

Alternatives and similar repositories for pyannote-audio_overlapped-speech-detection_cpp

Users that are interested in pyannote-audio_overlapped-speech-detection_cpp are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

leohuang2013 / pyannote-audio_speaker-diarization_cpp
View on GitHub
C++ version of pyannote audio speaker diarizaiton pipeline
☆22Feb 14, 2024Updated 2 years ago
reppy4620 / convnext_tts
View on GitHub
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18Oct 20, 2024Updated last year
lars76 / fastspeech2-clean
View on GitHub
Clean and modernized implementation of FastSpeech2/LightSpeech using IPA
☆18Aug 16, 2024Updated last year
p1an-lin-jung / wv_tts
View on GitHub
☆19Mar 22, 2024Updated 2 years ago
hbredin / pyannotebook
View on GitHub
🎹 pyannote + 🗒 notebook = pyannotebook
☆27Jun 12, 2023Updated 3 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
ex3ndr / supervoice-hybrid
View on GitHub
My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆26Aug 5, 2024Updated last year
BUTSpeechFIT / diacorrect
View on GitHub
Error correction back-end for speaker diarization
☆18Sep 26, 2023Updated 2 years ago
FrenchKrab / datasets-pyannote
View on GitHub
Automatically setup the AISHELL-4 and MSDWild dataset for usage with pyannote-database (and pyannote-audio)
☆15Oct 22, 2025Updated 9 months ago
uthree / ddsp-vocoder
View on GitHub
☆12Nov 7, 2024Updated last year
sushant-t / tts-trainer
View on GitHub
Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…
☆30May 27, 2023Updated 3 years ago
nikhilraghav29 / diarizen-tutorial
View on GitHub
DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline.
☆22Apr 24, 2026Updated 3 months ago
koth / EmotiVoice.cpp
View on GitHub
cpp inference for EmotiVoice
☆16Jan 1, 2024Updated 2 years ago
xiaoxue1117 / speech-mamba-public
View on GitHub
☆15Nov 26, 2024Updated last year
pengzhendong / streaming-vocos
View on GitHub
Streaming Vocos
☆31Jun 10, 2025Updated last year
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
kjw11 / Speaker-Aware-CTC
View on GitHub
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆22May 26, 2025Updated last year
poleval / 2021-punctuation-restoration
View on GitHub
PolEval 2021 Task 1
☆15Jun 28, 2022Updated 4 years ago
tuanio / nextformer
View on GitHub
PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"
☆10Dec 15, 2022Updated 3 years ago
audiodemo / voice-conversion
View on GitHub
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Aug 18, 2023Updated 2 years ago
uthree / tinyvc
View on GitHub
a lightweight voice conversion
☆87Feb 25, 2026Updated 5 months ago
BUTSpeechFIT / DiaPer
View on GitHub
☆69Feb 8, 2024Updated 2 years ago
speechcatcher-asr / speechcatcher-data
View on GitHub
☆11Sep 5, 2025Updated 10 months ago
MiscellaneousStuff / PhoneLM
View on GitHub
(R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.
☆48Sep 4, 2023Updated 2 years ago
ZarahShibli / Arabic_Punctuation_Prediction
View on GitHub
Sequence to sequence model for Arabic punctuation prediction.
☆12Feb 13, 2020Updated 6 years ago
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
7Xin / DPI-TTS
View on GitHub
☆13Sep 12, 2024Updated last year
RicherMans / PSL
View on GitHub
Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging"
☆31Apr 29, 2022Updated 4 years ago
WingZLeung / TTDS
View on GitHub
Text-to-dysarthric speech (TTDS) synthesis. An implementation using the Grad-TTS model with the TORGO database.
☆13Mar 15, 2025Updated last year
lissettecarlr / speaker-diarization
View on GitHub
将视频中不同说话人的声音提取后区分保存，得到音频训练数据
☆31May 23, 2024Updated 2 years ago
idiap / knn-tts
View on GitHub
Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model
☆36Apr 29, 2025Updated last year
ex3ndr / supervoice-enhance
View on GitHub
Supervoice diffusion enhance
☆28Jul 15, 2024Updated 2 years ago
Zhongxu-Wang / ArtSpeech
View on GitHub
ArtSpeech: Adaptive Text-to-Speech Synthesis with Articulatory Representations
☆22Sep 21, 2025Updated 10 months ago
prairie-schooner / wav2vec-vc
View on GitHub
☆10Mar 22, 2023Updated 3 years ago
odunola499 / f5-lora
View on GitHub
☆19Nov 18, 2025Updated 8 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
ogunlao / glowtts_stdp
View on GitHub
Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor
☆19Jun 5, 2023Updated 3 years ago
bootphon / learnable-strf
View on GitHub
Learnable STRF, from Riad et al. 2021 JASA
☆13Aug 21, 2021Updated 4 years ago
choiHkk / Transformer-TTS-V2
View on GitHub
☆25Mar 6, 2024Updated 2 years ago
shinhyeokoh / rwen
View on GitHub
☆14Jun 16, 2023Updated 3 years ago
facebookresearch / llama-hd-dataset
View on GitHub
This is a balanced dataset for English homograph disambiguation (HD), generated with Meta's Llama 2-Chat 70B model.
☆22Jan 22, 2024Updated 2 years ago
kadirnar / fast-dacvae
View on GitHub
☆20Mar 17, 2026Updated 4 months ago
patrickvonplaten / audio-gen-dreambooth
View on GitHub
☆23Jun 13, 2023Updated 3 years ago