Efficient approach to speaker diarization using voice characteristics extraction
☆108 · Jun 17, 2025 · Updated 9 months ago
Alternatives and similar repositories for WhoSpeaks
Users who are interested in WhoSpeaks are comparing it to the libraries listed below:
- AI at your fingertips: powerful CLI tools for speech, text, and language processing ☆22 · Sep 2, 2024 · Updated last year
- Command Your World with Voice ☆804 · Jun 17, 2025 · Updated 9 months ago
- Tr-VAD: An Efficient Transformer based Voice Activity Detection Model ☆17 · Aug 1, 2024 · Updated last year
- Experimental implementation of regions in WebVTT building on Anne's WebVTT parser. ☆14 · Oct 19, 2014 · Updated 11 years ago
- Low latency AI companion voice talk in 60 lines of code using faster_whisper and elevenlabs input streaming ☆315 · Jun 17, 2025 · Updated 9 months ago
- Transcribe desktop audio/computer audio in real-time and locally (Streaming ASR), using TorchAudio and Emformer-RNNT model for inference,… ☆14 · May 7, 2024 · Updated last year
- A python package to build AI-powered real-time audio applications ☆1,942 · Feb 12, 2025 · Updated last year
- Simple PyTorch Denoisers for Waveform Audio ☆41 · Mar 1, 2026 · Updated 2 weeks ago
- Replace any object you want on the image with whatever you want ☆14 · Feb 6, 2024 · Updated 2 years ago
- [Colab Demo Code] OneFormer: One Transformer to Rule Universal Image Segmentation. ☆14 · May 24, 2023 · Updated 2 years ago
- Roomey is a multi-purpose Voice Agent designed to run your personal and business life. ☆61 · Jun 15, 2025 · Updated 9 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆71 · Mar 5, 2026 · Updated last week
- ☆36 · Feb 1, 2026 · Updated last month
- Get started using Deepgram's Live Transcription with this Flask demo app ☆43 · Feb 28, 2026 · Updated 2 weeks ago
- Simulates talk with an AI that can express emotions ☆83 · Jun 17, 2025 · Updated 9 months ago
- Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization (ACM MM 2024) ☆22 · Jul 25, 2024 · Updated last year
- Converts text to speech in realtime ☆3,798 · Jan 11, 2026 · Updated 2 months ago
- Python Audio Separator in Real Time using MDX-NET model ☆25 · Jul 30, 2023 · Updated 2 years ago
- A highly-customizable OpenAI gym environment to train & evaluate RL agents trading stocks and crypto. ☆21 · Jun 6, 2023 · Updated 2 years ago
- An End-to-End Pipeline for Enhanced French Text-to-Speech with SSML Prosody Control ☆31 · Jan 13, 2026 · Updated 2 months ago
- Identity verification from speech ☆19 · Jul 19, 2022 · Updated 3 years ago
- Official Deepgram resources for deploying Deepgram services in a self-hosted environment ☆35 · Mar 5, 2026 · Updated last week
- Real-time processing and delivery of sentences from a continuous stream of characters or text chunks. ☆76 · Jul 13, 2025 · Updated 8 months ago
- Speaker diarization benchmark framework ☆39 · Jan 8, 2026 · Updated 2 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆76 · Jul 29, 2024 · Updated last year
- Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper ☆5,423 · Feb 23, 2026 · Updated 3 weeks ago
- ☆38 · Apr 3, 2025 · Updated 11 months ago
- A universal messaging library for cross-platform applications (Chrome extension, Web, Mobile, Iframe,...) ☆15 · Oct 10, 2025 · Updated 5 months ago
- Code and model for the AI City Challenge (CVPR 2022) Track 3 Action Detection (Naturalistic Driving Action Recognition) ☆28 · Jul 22, 2023 · Updated 2 years ago
- Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for transcription and RealtimeTTS with C… ☆714 · Jun 17, 2025 · Updated 9 months ago
- Auto fine tune of models with synthetic data ☆78 · Feb 14, 2024 · Updated 2 years ago
- A toolkit for speaker diarization. ☆413 · Mar 4, 2026 · Updated 2 weeks ago
- Repository for the mijn.amsterdam.nl portal ☆11 · Updated this week
- Image classification for Recyclables ☆10 · Sep 14, 2020 · Updated 5 years ago
- Voice Transformation for Videos. 🎤👄🎬 ☆260 · Jun 17, 2025 · Updated 9 months ago
- Continual Resilient (CoRe) Optimizer for PyTorch ☆11 · Jun 10, 2024 · Updated last year
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote ☆235 · Feb 19, 2025 · Updated last year
- List of repositories relevant to VITS. ☆36 · Feb 26, 2023 · Updated 3 years ago
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆167 · Dec 12, 2025 · Updated 3 months ago