KoljaB / WhoSpeaks
Efficient approach to speaker diarization using voice characteristics extraction
☆88 Updated 9 months ago
Alternatives and similar repositories for WhoSpeaks:
Users that are interested in WhoSpeaks are comparing it to the libraries listed below
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆92 Updated 9 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆57 Updated last week
- Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆150 Updated 7 months ago
- G2P ☆119 Updated this week
- ☆200 Updated 4 months ago
- ☆254 Updated 11 months ago
- Real-time Voice Activity Detection (VAD) with example use cases such as a simple voice bot and live (real-time) transcription ☆65 Updated 8 months ago
- ☆346 Updated 5 months ago
- FastAPI service on top of WhisperX ☆67 Updated 3 weeks ago
- speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with… ☆189 Updated last week
- Simulates talk with an AI that can express emotions ☆54 Updated 6 months ago
- ONNX Inference of Pyannote Segmentation ☆80 Updated last month
- ☆94 Updated 9 months ago
- ☆117 Updated 2 months ago
- A testing repo to share code and thoughts on diarisation ☆53 Updated 10 months ago
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆171 Updated 4 months ago
- Text to speech alignment using CTC forced alignment ☆218 Updated last month
- Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. Advanced audio processing. ☆235 Updated 8 months ago
- ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization. ☆205 Updated 3 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆119 Updated last week
- Speaker Diarization with Transformers ☆64 Updated 8 months ago
- Faster Tortoise inference than Tortoise Fast Fork ☆128 Updated 9 months ago
- Speech To Speech: an effort for an open-sourced and modular GPT-4o ☆40 Updated 4 months ago
- Open source inference code for Rev's model ☆377 Updated last month
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆109 Updated last year
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆127 Updated last week
- Collection of Open Source Speech Data ☆151 Updated 3 months ago
- Google's SoundStorm: Efficient Parallel Audio Generation ☆131 Updated last year
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆135 Updated last year
- Cog implementation of transcribing + diarization pipeline with Whisper & Pyannote ☆192 Updated this week