meronym / speaker-diarization
Speaker diarization model
☆18Updated last year
Related projects:
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆132Updated 4 months ago
- Stable word-level timestamps and confidence scores for OpenAI's Whisper outputs.☆25Updated last year
- A minimalistic automatic speech recognition Streamlit-based webapp powered by OpenAI's Whisper "State of the Art" models☆65Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆132Updated 8 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆81Updated 4 months ago
- On-device voice activity detection (VAD) powered by deep learning☆165Updated 2 weeks ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆132Updated last year
- Your one-stop solution for voice dataset creation☆106Updated 9 months ago
- ONNX Inference of Pyannote Segmentation☆54Updated last week
- Zero-shot Audio Classification using Whisper☆74Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆95Updated last year
- Create an LJSpeech structured voice dataset on wave input☆16Updated 2 months ago
- Real-Time Whisper Voice Recognition with vosk model feedback.☆103Updated last year
- Google's SoundStorm: Efficient Parallel Audio Generation☆115Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆64Updated 11 months ago
- 😎 Awesome lists about Speech Emotion Recognition☆57Updated this week
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆81Updated last year
- A speaker embedding network in PyTorch that is very quick to set up and use for whatever purposes.☆82Updated last year
- 56 language, 1 model Multilingual ASR☆23Updated 3 years ago
- ASRecognition: just an easy-to-use library for Automatic Speech Recognition.☆51Updated last year
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆47Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database☆79Updated 5 months ago
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.☆149Updated 6 months ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆56Updated last year
- Tunable pipelines☆26Updated 3 weeks ago
- Finetune VITS and MMS using HuggingFace's tools☆112Updated 5 months ago
- Putting flows on top of neural transducers for better TTS☆63Updated last month