Picovoice / cobra
On-device voice activity detection (VAD) powered by deep learning
☆165 Updated 2 weeks ago
Related projects:
- ONNX Inference of Pyannote Segmentation ☆54 Updated last week
- 🐸STT integration examples ☆117 Updated last year
- ☆30 Updated 7 months ago
- ONNX wrapper for espnet inference model ☆152 Updated 2 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆79 Updated 5 months ago
- Predicts the level of noise and reverberation on your audiofiles ☆134 Updated 3 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆312 Updated 6 months ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆126 Updated 3 months ago
- Open models for Coqui STT ☆119 Updated last year
- On-device noise suppression powered by deep learning ☆59 Updated 2 weeks ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆132 Updated 4 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆95 Updated last year
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆105 Updated 2 years ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆81 Updated 4 months ago
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. ☆194 Updated last month
- Variational Bayes HMM over x-vectors diarization ☆251 Updated 8 months ago
- Experiments to test different speech recognition systems for SEPIA Framework ☆57 Updated last year
- Diarization scoring tools. ☆213 Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆345 Updated this week
- Segment an audio file and obtain utterance alignments. (Python package) ☆319 Updated 4 months ago
- An online speech recognition extension toolkit of Kaldi ☆57 Updated 3 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆272 Updated 2 months ago
- A python library for voice activity detection (VAD) for speech/non-speech segmentation. ☆80 Updated 2 years ago
- C++ version of pyannote audio speaker diarization pipeline ☆17 Updated 7 months ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO ☆56 Updated last year
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023. ☆64 Updated 11 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆127 Updated this week
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆36 Updated last month
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It shows state-of-the-art results on the Speech Commands datasets V1 and V2. ☆98 Updated last year
- Tunable pipelines ☆26 Updated 3 weeks ago