MorenoLaQuatra/vad

Simple voice activity detection (VAD) algorithm in Python

☆15

Alternatives and similar repositories for vad

Users who are interested in vad are comparing it to the libraries listed below.

Mddct / simple-tts
(WIP) long form speech generations
☆30 · Apr 2, 2025 · Updated last year
lucadellalib / ts-asr
Target speaker automatic speech recognition (TS-ASR)
☆14 · Oct 14, 2023 · Updated 2 years ago
MorenoLaQuatra / bart-it
Pre-training BART model for the Italian Language
☆16 · Dec 28, 2022 · Updated 3 years ago
jwr1995 / PubSep
Repository of published DNN speech separation recipes for a number of datasets
☆13 · Jan 22, 2024 · Updated 2 years ago
k2-fsa / sherpa-mlx
sherpa with mlx
☆15 · Aug 2, 2025 · Updated 11 months ago
pengzhendong / compute-wer
Compute WER and SER for speech recognition evaluation
☆27 · Jun 6, 2026 · Updated last month
bond005 / vad
Various algorithms for voice activity detection
☆22 · Jan 31, 2017 · Updated 9 years ago
twardoch / audiostretchy
AudioStretchy is a Python wrapper around the `audio-stretch` C library, which performs fast, high-quality time-stretching of WAV/MP3 file…
☆60 · Jul 5, 2026 · Updated 3 weeks ago
asteroid-team / Libri_VAD
Script to generate VAD dataset used in Asteroid recipe
☆21 · Sep 30, 2021 · Updated 4 years ago
MorenoLaQuatra / audiocaps-download
This package aims at simplifying the download of the AudioCaps dataset.
☆35 · Dec 1, 2023 · Updated 2 years ago
ArenAcikgoz / Whisper-Alignment
Forced alignment decoder for Whisper.
☆16 · Mar 13, 2024 · Updated 2 years ago
pkufool / simple-wer
A simple command line tool to calculate WER for ASR.
☆14 · Oct 14, 2024 · Updated last year
kamilakesbi / DiarizersLM
☆15 · Jul 16, 2024 · Updated 2 years ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆38 · Jun 15, 2026 · Updated last month
hlt-mt / Speech-MASSIVE
Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI…
☆25 · Oct 8, 2025 · Updated 9 months ago
pengzhendong / speaker-diarization
Offline Speaker Diarization with SenseVoice by Sherpa ONNX.
☆15 · Dec 23, 2024 · Updated last year
pengzhendong / wavesurfer
For audio visualization and playback in Jupyter notebooks.
☆18 · Nov 25, 2025 · Updated 8 months ago
frankyoujian / Edge-Punct-Casing
☆33 · Feb 4, 2025 · Updated last year
dengcunqin / noise-reduction
noise reduction
☆17 · Jul 3, 2024 · Updated 2 years ago
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)
☆39 · Mar 31, 2026 · Updated 3 months ago
colaudiolab / AudioSet-R
Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"
☆19 · Oct 9, 2025 · Updated 9 months ago
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
pengzhendong / ngram-punctuator
An N-gram punctuator for Chinese and English.
☆18 · Oct 14, 2025 · Updated 9 months ago
pengzhendong / audio-pipeline
☆23 · Oct 17, 2024 · Updated last year
MorenoLaQuatra / audioset-download
This package aims at simplifying the download of the AudioSet dataset.
☆60 · Jul 17, 2025 · Updated last year
pengzhendong / streaming-asr
One command to start a streaming ASR server.
☆12 · Oct 2, 2024 · Updated last year
pengzhendong / torchfa
Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.
☆61 · Sep 5, 2025 · Updated 10 months ago
kyegomez / OpenStrawberry
An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO
☆30 · Updated this week
Kvothe045 / Audio-Enhancer
☆13 · Aug 3, 2025 · Updated 11 months ago
google-research-datasets / LLAMA1-Test-Set
We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We pr…
☆23 · Mar 14, 2024 · Updated 2 years ago
savasy / TC32
Text Classification Dataset for Turkish Language
☆10 · Nov 16, 2021 · Updated 4 years ago
VikhrModels / Salt
☆60 · Dec 17, 2025 · Updated 7 months ago
pengzhendong / streaming-ChatTTS
☆23 · Oct 30, 2024 · Updated last year
cgarciae / simple_flow_matching
☆23 · Dec 16, 2024 · Updated last year
Slyne / ctc_decoder
A CTC decoder for both online and offline ASR models
☆66 · Nov 18, 2023 · Updated 2 years ago
jzshq208886 / wenet_asr
☆12 · Jul 11, 2024 · Updated 2 years ago
elianap / divexplorer
☆11 · May 5, 2022 · Updated 4 years ago
pengzhendong / welm
One command to build TLG.fst for WeNet.
☆30 · Oct 11, 2022 · Updated 3 years ago
boun-tabi / SQuAD-TR
☆11 · Jun 8, 2024 · Updated 2 years ago