CiscoDevNet / vo-id
☆11 Updated 3 years ago
Alternatives and similar repositories for vo-id
Users who are interested in vo-id are comparing it to the libraries listed below.
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆149 Updated last year
- Various speech datasets made available to the public ☆121 Updated 5 months ago
- ☆38 Updated 3 years ago
- Diarization scoring tools. ☆247 Updated 2 years ago
- ☆85 Updated 8 months ago
- Speaker change detection using SincNet and an LSTM/Transformer ☆51 Updated last week
- [deprecated] Pretrained models for pyannote-audio 1.x ☆72 Updated 2 years ago
- A lightweight library to compute Diarization Error Rate (DER). ☆59 Updated last year
- ☆43 Updated 2 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆137 Updated last year
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆100 Updated 3 months ago
- Rescoring methods for end-to-end Automatic Speech Recognition ☆27 Updated 4 years ago
- ☆294 Updated 11 months ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆82 Updated 2 years ago
- Support tools for punctuation and boundary detection for ASR output. ☆57 Updated 2 years ago
- Variational Bayes HMM over x-vectors diarization ☆270 Updated last year
- Python server for communicating with Kaldi from the browser using WebRTC ☆69 Updated last year
- ☆46 Updated 2 years ago
- Spot the conversation: speaker diarisation in the wild ☆140 Updated 2 years ago
- Multistream CNN for Robust Acoustic Modeling ☆40 Updated 3 years ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆119 Updated 3 years ago
- Triton backend for https://github.com/OpenNMT/CTranslate2 ☆35 Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆420 Updated 2 months ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 Updated 3 years ago
- Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM ☆39 Updated last year
- ☆79 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆114 Updated 2 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆62 Updated 2 months ago
- Online streaming speaker change detection model in Pytorch ☆39 Updated 2 years ago
- PAFTS: a library that preprocesses audio for TTS. ☆20 Updated 6 months ago