jsvir/vad

[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection

☆40

Alternatives and similar repositories for vad

Users who are interested in vad are comparing it to the libraries listed below.

jiwonix / Sound-Event-Detection-papers
Sound Event Detection (SED) paper collection
☆15 · Jun 26, 2024 · Updated 2 years ago
jsvir / sparknet
[Tiny KWS] SparkNet: Sparse Binarization for Fast Keyword Spotting
☆20 · Aug 26, 2025 · Updated 10 months ago
swagshaw / TorchKWS
Collection of PyTorch implementations of Spoken Keyword Spotting presented in research papers.
☆41 · Apr 5, 2024 · Updated 2 years ago
Qualcomm-AI-research / bcresnet
☆100 · May 31, 2023 · Updated 3 years ago
NikolaiKyhne / xLSTM-SENet
Official repository for the paper "xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement" (Accepted to INTERSPEECH 2025)
☆60 · Aug 28, 2025 · Updated 10 months ago
lugan113 / SynTTS-Commands-Official
SynTTS-Commands is a large-scale, multilingual (English & Chinese) synthetic speech command dataset designed for low-power Keyword Spotti…
☆17 · Feb 5, 2026 · Updated 5 months ago
kaistmm / Metric-UD-KWS
Official code for Metric learning for user-defined keyword spotting
☆40 · Feb 21, 2024 · Updated 2 years ago
FrenchKrab / IS2023-powerset-diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
☆96 · Oct 18, 2023 · Updated 2 years ago
mmmgalleria / Dual-Microphone-Noise-Reduction-by-PLD-Technique
Working on a dual-microphone noise reduction for mobile phone in noisy environment by Power Level Different Technique (PLD).
☆17 · Jul 25, 2020 · Updated 5 years ago
yqcai888 / DCASE2023
2022 DCASE Challenge
☆14 · Sep 30, 2024 · Updated last year
YosukeHiguchi / espnet
End-to-End Speech Processing Toolkit
☆16 · Jan 20, 2025 · Updated last year
Tokisywyy / AECNS-CAGCRN
☆18 · Mar 9, 2025 · Updated last year
merlresearch / reverberation-as-supervision
Enhanced Reverberation As Supervision (ERAS) for unsupervised reverberant speech separation
☆15 · Aug 1, 2024 · Updated last year
JethroWangSir / SincQDR-VAD
☆26 · Aug 29, 2025 · Updated 10 months ago
yqcai888 / easy_dcase_task1
This repository provides an easy way to train your models on the datasets of DCASE task 1.
☆20 · May 28, 2025 · Updated last year
desh2608 / dover-lap
Python package for combining diarization system outputs.
☆94 · Oct 12, 2023 · Updated 2 years ago
Maokui-He / NSD-MA-MSE
A pytorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"
☆62 · Sep 19, 2024 · Updated last year
Mo-yun / DSDPRNN
Implementation of Dual-Stream DPRNN (paper: Nonlinear Residual Echo Suppression Based on Dual-Stream DPRNN)
☆21 · Jul 15, 2021 · Updated 5 years ago
xiaochunxin / OMLSA-MCRA
C++ speech enhancement base on OMLSA-MCRA
☆63 · Aug 4, 2020 · Updated 5 years ago
muggle-stack / sensevoice_cpp
☆25 · Mar 8, 2026 · Updated 4 months ago
YUCHEN005 / Gradient-Remedy
Code for paper "Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition"
☆21 · May 24, 2023 · Updated 3 years ago
yinruiqing / tiny-transducer
Tiny Transducer: A Highly-Efficient Speech Recognition Model on Edge Devices
☆30 · Aug 4, 2022 · Updated 3 years ago
jmcasebeer / autodsp
Train custom adaptive filter optimizers without hand tuning or extra labels.
☆67 · Oct 14, 2021 · Updated 4 years ago
Sreyan88 / LAPE
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)
☆29 · Jul 9, 2024 · Updated 2 years ago
XiaoxiangGao / Dual_Channel_Beamformer_and_Postfilter
This project gives an example of dual microphone speech enhancement based on GSC beamformer and multiple channel postfilter.
☆104 · Aug 22, 2018 · Updated 7 years ago
JYWanng / LBCCN
☆27 · May 5, 2025 · Updated last year
jolin830 / SlowFast-Meet-ViT
We have implemented Track # 1 for ICME 2024: Spatial Action Localization on Chaotic World dataset. Our mAP on the validation set reaches …
☆14 · Nov 11, 2024 · Updated last year
NickWilkinson37 / voxseg
A python library for voice activity detection (VAD) for speech/non-speech segmentation.
☆88 · Sep 7, 2022 · Updated 3 years ago
wangchengzhong / GRE-Net
Official Repository for "Global Rotation Equivariant Phase Modeling for Speech Enhancement with Deep Magnitude-Phase Interaction"
☆19 · Jun 25, 2026 · Updated 3 weeks ago
huaidanquede / PrimeK-Net
PrimeK-Net official code
☆29 · Mar 5, 2025 · Updated last year
hainan-xv / PASM
Pronunciation-assisted Subword Modeling
☆31 · May 30, 2019 · Updated 7 years ago
daniel03c1 / NAS_VAD
☆26 · Oct 25, 2024 · Updated last year
biboamy / AVASpeech_Music_Labels
☆20 · Nov 3, 2021 · Updated 4 years ago
joonaskalda / PixIT
Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings…
☆105 · Jan 10, 2025 · Updated last year
soham97 / awesome-sound_event_detection
Reading list for research topics in Sound AI
☆201 · Aug 8, 2024 · Updated last year
cakimpei / khanaa
A tool to make spelling Thai more convenient
☆12 · Mar 30, 2024 · Updated 2 years ago
helloooideeeeea / RealTimeCutVADCXXLibrary
C++ implementation of real-time Voice Activity Detection (VAD) using Silero models with ONNX Runtime and WebRTC Audio Processing. Provide…
☆14 · Feb 19, 2026 · Updated 5 months ago
zhenghuatan / rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised …
☆140 · Jan 20, 2024 · Updated 2 years ago
OmarMedhat22 / Sound-Classification-Short-Time-Fourier-Transform-STFT
☆15 · May 28, 2020 · Updated 6 years ago