Picovoice / cobra
On-device voice activity detection (VAD) powered by deep learning
☆198Updated this week
Alternatives and similar repositories for cobra:
Users who are interested in cobra are comparing it to the libraries listed below
- On-device speaker diarization powered by deep learning☆38Updated last week
- Reproducible experimental protocols for multimedia (audio, video, text) databases☆96Updated last week
- Variational Bayes HMM over x-vectors diarization☆263Updated last year
- ☆39Updated last year
- This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at…☆396Updated 3 months ago
- Open models for Coqui STT☆129Updated last year
- Voice Activity Detection (VAD) using deep learning.☆194Updated 5 years ago
- Experiments to test different speech recognition systems for SEPIA Framework☆58Updated last year
- Diarization scoring tools.☆235Updated last year
- ONNX Inference of Pyannote Segmentation☆80Updated last month
- Segment an audio file and obtain utterance alignments. (Python package)☆328Updated 9 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines☆92Updated 9 months ago
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event …☆355Updated last year
- On-device noise suppression powered by deep learning☆66Updated this week
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface.☆286Updated last year
- Predicts the level of noise and reverberation in your audio files☆144Updated 8 months ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram☆240Updated 6 months ago
- Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice.☆202Updated 6 months ago
- This repository contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.☆291Updated 2 years ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper.☆135Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆145Updated 9 months ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆131Updated 2 months ago
- Various speech datasets made available to the public☆113Updated 2 months ago
- 🐸 - A general purpose model trainer, as flexible as it gets☆205Updated 11 months ago
- A non-native English corpus for the pronunciation scoring task☆123Updated 7 months ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…☆220Updated 2 years ago
- Kaldi-compatible online fbank extractor without external dependencies☆87Updated 2 months ago
- A tokenizer, text cleaner, and phonemizer for many human languages.☆303Updated 3 months ago
- This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"☆116Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model.☆341Updated last year