Picovoice / voice-activity-benchmark
Voice activity engine benchmark framework
☆15 Updated last month
Alternatives and similar repositories for voice-activity-benchmark
Users who are interested in voice-activity-benchmark are comparing it to the libraries listed below.
- ☆56 Updated 2 years ago
- A handy dataset of noises for ASR ☆21 Updated 6 years ago
- ☆17 Updated 2 years ago
- ☆28 Updated 4 months ago
- This repository contains all the code necessary for running the multilingual distilwhisper from Ferraz et al. 2024 IEEE ICASSP paper. ☆24 Updated last year
- In this repository, I try to combine k2 with speechbrain to decode well and fastly. ☆16 Updated 3 years ago
- ☆34 Updated last year
- Lightweight wrapper for Silero VAD using internal ONNX Runtime and with no python package dependencies ☆14 Updated 7 months ago
- Clustering-based methods for overlapping diarization ☆80 Updated last year
- ☆37 Updated last month
- Source Code for the Paper "UNIFIED KEYWORD SPOTTING AND AUDIO TAGGING ON MOBILE DEVICES WITH TRANSFORMERS" ☆23 Updated 2 years ago
- ☆12 Updated 4 months ago
- ☆38 Updated 3 years ago
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z… ☆31 Updated 3 years ago
- Rescoring methods for end-to-end Automatic Speech Recognition ☆27 Updated 4 years ago
- 56 language, 1 model Multilingual ASR ☆25 Updated 3 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆73 Updated 4 years ago
- asr2k ☆50 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆52 Updated last month
- A lightweight library to compute Diarization Error Rate (DER). ☆59 Updated last year
- ☆16 Updated last year
- Pronunciation-assisted Subword Modeling ☆29 Updated 6 years ago
- Implementation of the contextual biasing for ASR decoding on GPUs without lattice generation. The code supports submission to Interspeech… ☆20 Updated last year
- Decoders from Kaldi using OpenFst ☆29 Updated 5 months ago
- ☆61 Updated last year
- Phoneme alignment representation compatible with multiple forced aligners ☆21 Updated last year
- ☆12 Updated 3 years ago
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup ☆71 Updated 9 months ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆102 Updated 4 months ago
- Python wrapper for OpenFST and its extensions from Kaldi. Also support reading/writing ark/scp files ☆53 Updated 2 months ago