Yifei-ZHAO96/Tr-VAD

Tr-VAD: An Efficient Transformer based Voice Activity Detection Model

☆18

Alternatives and similar repositories for Tr-VAD

Users that are interested in Tr-VAD are comparing it to the libraries listed below.

Yifei-ZHAO96 / STAM-pytorch
PyTorch implementation of "spectro-temporal attention-based voice activity detection"
☆13 · Jun 4, 2024 · Updated 2 years ago
thomeou / SALSA-Lite
This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.
☆15 · Dec 3, 2021 · Updated 4 years ago
JethroWangSir / SincQDR-VAD
☆26 · Aug 29, 2025 · Updated 11 months ago
ddxsg24 / Personalized-Speech-Enhancement
ASLP Summer Inter@NPU
☆13 · Jul 30, 2024 · Updated last year
ina-foss / InaGVAD
Voice activity detection and speaker gender segmentation audiovisual corpus
☆16 · Jan 20, 2025 · Updated last year
BUTSpeechFIT / cgmm_mvdr_online
Implementation of CGMM-MVDR beamforming used for Clarity challenge
☆14 · Jan 14, 2022 · Updated 4 years ago
HuPER29 / HuPER
☆16 · Mar 19, 2026 · Updated 4 months ago
prerak23 / Dir_SrcMic_DOA
Codebase of the submitted work in ICASSP 2023
☆14 · Nov 30, 2022 · Updated 3 years ago
nianlonggu / WhisperSeg
Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection
☆42 · Jul 25, 2025 · Updated last year
Nikait / FastWave
FastWave is a lightweight diffusion model for general audio super-resolution (any -> 48 kHz). SOTA quality reconstruction metrics with ju…
☆18 · May 16, 2026 · Updated 2 months ago
fclearner / Personal-vad-2.0
Implementation of "Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition"
☆16 · Jun 9, 2026 · Updated last month
nanless / universal-speech-enhancement
Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…
☆83 · Jul 29, 2024 · Updated 2 years ago
pirxus / personalVAD
An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.
☆90 · Sep 22, 2022 · Updated 3 years ago
HolgerBovbjerg / SSL-PVAD
A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…
☆25 · Nov 25, 2024 · Updated last year
zhenghuatan / rVADfast
This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…
☆154 · Updated this week
IU-SAIGE / pse
Efficient Personalized Speech Enhancement through Self-Supervised Learning
☆23 · Mar 12, 2023 · Updated 3 years ago
byuccl / fiate
Fault Injection Automatic Test Equipment
☆16 · Nov 22, 2021 · Updated 4 years ago
SELMA-project / ml4audio
audio, NLP, ML with huggingface, nvidia/nemo, speechbrain
☆11 · Sep 4, 2023 · Updated 2 years ago
facebookresearch / SS2_HRTF
SS2 HRTF Dataset - Reality Labs Research Audio
☆18 · May 22, 2026 · Updated 2 months ago
Taichi-Pink / LightFuse-Lightweight-CNN-based-Dual-exposure-Fusion
☆11 · Sep 22, 2022 · Updated 3 years ago
Hunterhuan / sphereface2_speaker_verification
Exploring Binary Classification Loss for Speaker Verification
☆18 · Jul 18, 2023 · Updated 3 years ago
chentuochao / Sound_Bubble
Project for speech bubble
☆66 · Aug 15, 2025 · Updated 11 months ago
lllibano / LABFT
A parametric RTL code generator of an efficient integer MxM Systolic Array implementation for Xilinx FPGAs, with error detection capabili…
☆14 · Aug 28, 2025 · Updated 11 months ago
ftshijt / speech_evaluation
A toolkit dedicated to speech evaluation.
☆23 · Sep 26, 2024 · Updated last year
wxqwinner / silero-vad-ncnn
Silero VAD(ncnn): pre-trained enterprise-grade Voice Activity Detector.
☆26 · Aug 21, 2024 · Updated last year
inverse-ai / FINALLY-Speech-Enhancement
FINALLY: Fast and universal speech enhancement model delivering studio-quality audio for a wide range of recordings.
☆28 · Apr 1, 2026 · Updated 3 months ago
arda-num / SFSRNet
Reproduction of the paper SFSRNet: Super-resolution for single-channel Audio Source Separation by me (@arda-num) and @dritx16. Navigate P…
☆12 · Jul 7, 2022 · Updated 4 years ago
I-Doctor / gnn-acceleration-framework-with-FPGA
including compiler to encode DGL GNN model to instructions, runtime software to transfer data and control the accelerator, and hardware v…
☆14 · Nov 19, 2023 · Updated 2 years ago
nblt / mARWP
[TMLR 2024] Revisiting Random Weight Perturbation for Efficiently Improving Generalization
☆12 · Oct 18, 2024 · Updated last year
Intersection98 / ComfyUI_MX_post_processing-nodes
☆13 · May 23, 2024 · Updated 2 years ago
zephyrchien / ztun
TCP tunnel powered by epoll
☆15 · Dec 16, 2021 · Updated 4 years ago
axeber01 / wav2pos
3D Sound Source Localization using Masked Autoencoders
☆21 · Feb 12, 2025 · Updated last year
talhanai / kaldi-diar-latte
steps to perform text-based speaker diarization with kaldi toolkit
☆12 · Nov 2, 2018 · Updated 7 years ago
AlessioMichelassi / openPyVision_013
Welcome to my project. OpenPyVision is a real time videoMixer based on opencv and pyqt6.
☆14 · Aug 22, 2024 · Updated last year
kwatcharasupat / divide-and-remaster-v3
Landing Page for Divide and Remaster v3
☆26 · Jul 29, 2025 · Updated last year
ACS-Storage-Group / FlexRaft-Code
☆12 · Mar 11, 2024 · Updated 2 years ago
yluo42 / SRVQ
Spherical residual vector quantization (SRVQ)
☆31 · Aug 25, 2024 · Updated last year
UBC-NLP / octopus
Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)
☆10 · Apr 29, 2024 · Updated 2 years ago
logikon-ai / cot-eval
A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.
☆19 · Feb 6, 2025 · Updated last year