Efficient approach to speaker diarization using voice characteristics extraction
☆108 · Jun 17, 2025 · Updated 11 months ago
Alternatives and similar repositories for WhoSpeaks
Users that are interested in WhoSpeaks are comparing it to the libraries listed below.
- Command Your World with Voice ☆811 · Jun 17, 2025 · Updated 11 months ago
- Transcribe desktop audio/computer audio in real-time and locally (Streaming ASR), using TorchAudio and Emformer-RNNT model for inference,… ☆14 · May 7, 2024 · Updated 2 years ago
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model ☆18 · Aug 1, 2024 · Updated last year
- Simulates talk with an AI that can express emotions ☆87 · Apr 4, 2026 · Updated last month
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆70 · Apr 22, 2026 · Updated 3 weeks ago
- A python package to build AI-powered real-time audio applications ☆1,974 · Feb 12, 2025 · Updated last year
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life. ☆64 · Jun 15, 2025 · Updated 11 months ago
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆79 · Mar 31, 2026 · Updated last month
- PAFTS: Library That Preprocesses Audio For TTS. ☆27 · Nov 15, 2024 · Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 · Feb 14, 2024 · Updated 2 years ago
- Converts text to speech in realtime ☆3,919 · May 10, 2026 · Updated last week
- Simple PyTorch Denoisers for Waveform Audio ☆41 · Apr 4, 2026 · Updated last month
- ☆11 · Sep 28, 2024 · Updated last year
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote ☆235 · Feb 19, 2025 · Updated last year
- auto fine tune of models with synthetic data ☆78 · Feb 14, 2024 · Updated 2 years ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators ☆19 · May 13, 2026 · Updated last week
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆22 · Jul 25, 2024 · Updated last year
- Exploring Binary Classification Loss for Speaker Verification ☆18 · Jul 18, 2023 · Updated 2 years ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX. ☆15 · Dec 23, 2024 · Updated last year
- A python package for deep multilingual punctuation prediction. ☆164 · Aug 21, 2024 · Updated last year
- Transcription and annotation interface for recorded audio or video files ☆55 · May 13, 2026 · Updated last week
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆77 · Jul 29, 2024 · Updated last year
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper ☆5,519 · Feb 23, 2026 · Updated 2 months ago
- Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with C… ☆722 · Jun 17, 2025 · Updated 11 months ago
- 🎹 pyannote + 🗒 notebook = pyannotebook ☆26 · Jun 12, 2023 · Updated 2 years ago
- Local AI photo scoring, culling, and gallery — score, organise, and explore your library with face recognition and semantic search. No cl… ☆89 · Apr 28, 2026 · Updated 3 weeks ago
- Identity verification from speech ☆19 · Jul 19, 2022 · Updated 3 years ago
- Svelte app to generate audiobooks using XTTS ☆12 · Feb 13, 2024 · Updated 2 years ago
- [Colab Demo Code] OneFormer: One Transformer to Rule Universal Image Segmentation. ☆14 · May 24, 2023 · Updated 2 years ago
- A toolkit for speaker diarization. ☆460 · Apr 9, 2026 · Updated last month
- C++ version of pyannote audio speaker diarization pipeline ☆22 · Feb 14, 2024 · Updated 2 years ago
- Custom ComfyUI node that combines VSR + VFI and allows streaming processing for arbitrary video length. ☆63 · Mar 28, 2026 · Updated last month
- audio, NLP, ML with huggingface, nvidia/nemo, speechbrain ☆11 · Sep 4, 2023 · Updated 2 years ago
- A simple Python tool to measure the performance of ONNX models. ☆27 · Sep 15, 2024 · Updated last year
- A scalable solution that simplifies the integration of ComfyUI for developers ☆11 · Jul 15, 2024 · Updated last year
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- Get started using Deepgram's Live Transcription with this Flask demo app ☆46 · Apr 11, 2026 · Updated last month
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project. ☆84 · Sep 22, 2022 · Updated 3 years ago
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆219 · Oct 30, 2024 · Updated last year