meronym / speaker-diarization
Speaker diarization model
☆23Updated last year
Alternatives and similar repositories for speaker-diarization:
Users who are interested in speaker-diarization are comparing it to the libraries listed below.
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆91Updated 8 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆145Updated 8 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback.☆108Updated last year
- Create an LJSpeech structured voice dataset on wave input☆24Updated 4 months ago
- OpenAI Whisper Prompt Examples☆50Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆53Updated last week
- On-device speaker diarization powered by deep learning☆34Updated 2 weeks ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- On-device voice activity detection (VAD) powered by deep learning☆192Updated 2 weeks ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration☆114Updated 2 weeks ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆78Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation☆130Updated last year
- Stable timestamps and confidence score for words of OpenAI's Whisper outputs down to word-level.☆25Updated 2 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆90Updated 3 months ago
- 🌼 Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition☆16Updated 11 months ago
- Speaker Diarization with Transformers☆64Updated 8 months ago
- Joint speech-language model - respond directly to audio!☆30Updated 8 months ago
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…☆74Updated 3 weeks ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆140Updated last year
- This is the audio sample repository for speech separation model "MossFormer2".☆120Updated 2 months ago
- ONNX Inference of Pyannote Segmentation☆81Updated last month
- Accelerate Whisper tasks such as transcription by multiprocessing through parallelization☆25Updated 2 years ago
- 💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.☆203Updated 3 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆107Updated last year
- Finetune VITS and MMS using HuggingFace's tools☆131Updated 10 months ago