Picovoice / cobra
On-device voice activity detection (VAD) powered by deep learning
☆202 Updated this week
Alternatives and similar repositories for cobra:
Users who are interested in cobra are comparing it to the libraries listed below.
- 🐸STT integration examples ☆126 Updated 2 years ago
- ONNX Inference of Pyannote Segmentation ☆80 Updated 3 months ago
- Open models for Coqui STT ☆134 Updated last year
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. ☆203 Updated 7 months ago
- ☆39 Updated last year
- On-device speaker diarization powered by deep learning ☆39 Updated this week
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆98 Updated last month
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆344 Updated last year
- Experiments to test different speech recognition systems for SEPIA Framework ☆59 Updated last year
- Diarization scoring tools. ☆240 Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆109 Updated 2 years ago
- ONNX wrapper for ESPnet inference model ☆161 Updated 5 months ago
- On-device noise suppression powered by deep learning ☆68 Updated this week
- Voice Activity Detection (VAD) using deep learning. ☆194 Updated 5 years ago
- Various speech datasets made available to the public ☆114 Updated 3 months ago
- DeepSpeech based forced alignment tool ☆237 Updated 4 years ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆146 Updated 10 months ago
- Python bindings of WebRTC Audio Processing ☆186 Updated 6 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆364 Updated last year
- Predicts the level of noise and reverberation on your audio files ☆146 Updated 10 months ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv… ☆134 Updated 3 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆94 Updated 10 months ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap" ☆117 Updated 2 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆333 Updated 10 months ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆243 Updated 7 months ago
- Putting flows on top of neural transducers for better TTS ☆62 Updated 2 weeks ago
- OpenVINO version of openai/whisper ☆166 Updated last year
- Real-Time Whisper Voice Recognition with vosk model feedback. ☆112 Updated last year
- Target Speaker Extraction Toolkit ☆149 Updated 2 weeks ago
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at… ☆405 Updated last month