NickWilkinson37/voxseg

A Python library for voice activity detection (VAD) for speech/non-speech segmentation.

☆88

Alternatives and similar repositories for voxseg

Users who are interested in voxseg are comparing it to the libraries listed below.

voithru / voice-activity-detection
PyTorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021
☆159 · Oct 26, 2021 · Updated 4 years ago
jymsuper / VAD_tutorial
Simple DNN based Voice Activity Detection (VAD) using PyTorch
☆43 · Feb 8, 2020 · Updated 6 years ago
skgusrb12 / voice_activity_detection
PyTorch version of Voice Activity Detection (VAD) based on Deep Learning (https://github.com/filippogiruzzi)
☆27 · Mar 20, 2021 · Updated 5 years ago
zhenghuatan / rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…
☆154 · Updated this week
RicherMans / GPV
Repository for our Interspeech2020 general-purpose voice activity detection (GPVAD) paper
☆141 · Aug 3, 2023 · Updated 2 years ago
farisalasmary / wav2vec2-kenlm
Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding
☆74 · Oct 11, 2021 · Updated 4 years ago
maveryn / robust-vad
Lightweight CNN for Robust Voice Activity Detection
☆20 · Jun 30, 2023 · Updated 3 years ago
RicherMans / Datadriven-GPVAD
The codebase for Data-driven general-purpose voice activity detection.
☆93 · Aug 3, 2023 · Updated 2 years ago
danielbraithwt / Speech-Enhancement-with-Variance-Constrained-Autoencoders
Code and audio files associated with the paper "Speech Enhancement with Variance Constrained Autoencoders" presented at Interspeech 2019
☆15 · Oct 10, 2019 · Updated 6 years ago
VKW2021 / kaldi-baseline
Kaldi CNN-TDNNF baseline
☆13 · Aug 31, 2021 · Updated 4 years ago
zhenghuatan / rVAD
Matlab and Python libraries for an unsupervised method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsupervised …
☆140 · Jan 20, 2024 · Updated 2 years ago
nicklashansen / voice-activity-detection
Voice Activity Detection (VAD) using deep learning.
☆204 · Oct 14, 2019 · Updated 6 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 · Nov 5, 2021 · Updated 4 years ago
Yifei-ZHAO96 / STAM-pytorch
PyTorch implementation of "spectro-temporal attention-based voice activity detection"
☆13 · Jun 4, 2024 · Updated 2 years ago
qiujiali / lattice-rescore
☆16 · Jun 13, 2022 · Updated 4 years ago
htqin / BiFSMN
PyTorch implementation of BiFSMN, IJCAI 2022
☆22 · Feb 10, 2023 · Updated 3 years ago
yuhangear / wenet-android
☆13 · Oct 27, 2021 · Updated 4 years ago
bagustris / w2v2-vad
A wrapper for Audeering's wav2vec-based dimensional speech emotion recognition
☆22 · Aug 9, 2023 · Updated 2 years ago
daniel03c1 / NAS_VAD
☆26 · Oct 25, 2024 · Updated last year
jsvir / vad
[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection
☆40 · Mar 24, 2025 · Updated last year
ruanvdmerwe / triplet-entropy-loss
Project repository for the work done in Triplet Entropy Loss: Improving The Generalization of Short Speech Language Identification Syst…
☆13 · Feb 17, 2021 · Updated 5 years ago
filippogiruzzi / voice_activity_detection
Voice Activity Detection based on Deep Learning & TensorFlow
☆373 · Jul 22, 2026 · Updated last week
kamperh / recipe_swbd_wordembeds
☆22 · Mar 22, 2017 · Updated 9 years ago
RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80 · Sep 27, 2023 · Updated 2 years ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 · May 5, 2026 · Updated 2 months ago
Okrio / tinyrecurrentunet
Real-Time De-noising and De-reverbing with Tiny Recurrent UNet
☆56 · Jun 7, 2023 · Updated 3 years ago
RapidAI / RapidSpeech.cpp
On-device speech AI runtime for ASR, TTS, VAD, and voice cloning. Python-simple, C++-native, GGUF-powered.
☆22 · Jul 15, 2026 · Updated 2 weeks ago
doerlbh / MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
☆29 · Sep 20, 2021 · Updated 4 years ago
huaidanquede / Dense-TSNet
Official code for Dense-TSNet
☆12 · Sep 17, 2024 · Updated last year
HawkAaron / RNN-Transducer
MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
☆140 · Jun 7, 2021 · Updated 5 years ago
sarulab-speech / jtubespeech
☆233 · Nov 13, 2023 · Updated 2 years ago
MuyangDu / T5Voice
T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …
☆28 · Nov 7, 2025 · Updated 8 months ago
gbegus / DeepPhonologyTool
Train a fiwGAN or ciwGAN model using your own training data
☆14 · Oct 13, 2022 · Updated 3 years ago
LeBenchmark / Interspeech2021
This repository describes our reproducible framework for assessing self-supervised representation learning from speech
☆52 · Oct 8, 2021 · Updated 4 years ago
SIP-Lab / CNN-VAD
A Convolutional Neural Network based Voice Activity Detector for Smartphones
☆70 · Apr 30, 2019 · Updated 7 years ago
zizyzhang / DNN-Based-Singing-Voice-Synthesis
DNN based singing voice synthesis
☆17 · Oct 15, 2018 · Updated 7 years ago
cornerfarmer / ctc_segmentation
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75 · Oct 9, 2020 · Updated 5 years ago
ICASSP2021-tutorial9 / Distant_conversational_ASR_and_analysis
☆12 · Jun 10, 2021 · Updated 5 years ago
Ephrem-ETH / E2E-KWS
End-to-End Keyword Spotting (E2E-KWS) using a character level LSTM
☆45 · Nov 18, 2022 · Updated 3 years ago