Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆18Aug 1, 2024Updated last year
Alternatives and similar repositories for Tr-VAD
Users that are interested in Tr-VAD are comparing it to the libraries listed below.
- PyTorch implementation of "spectro-temporal attention-based voice activity detection"☆13Jun 4, 2024Updated 2 years ago
- A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆24Nov 25, 2024Updated last year
- This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.☆15Dec 3, 2021Updated 4 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆78Jul 29, 2024Updated last year
- Implementation of CGMM-MVDR beamforming used for Clarity challenge☆14Jan 14, 2022Updated 4 years ago
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection☆41Jul 25, 2025Updated 10 months ago
- ASLP Summer Inter@NPU☆12Jul 30, 2024Updated last year
- Codebase of the submitted work in ICASSP 2023☆14Nov 30, 2022Updated 3 years ago
- A comprehensive framework to test audio comprehension of Large Audio Language Models.☆65Jun 9, 2026Updated last week
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.☆87Sep 22, 2022Updated 3 years ago
- 3D Sound Source Localization using Masked Autoencoders☆20Feb 12, 2025Updated last year
- Landing Page for Divide and Remaster v3☆26Jul 29, 2025Updated 10 months ago
- Exploring Binary Classification Loss for Speaker Verification☆18Jul 18, 2023Updated 2 years ago
- Fault Injection Automatic Test Equipment☆15Nov 22, 2021Updated 4 years ago
- ☆24Jul 10, 2025Updated 11 months ago
- This is the official implementation of PGUSE☆40Jun 7, 2025Updated last year
- Universal differential equations for ecologists☆15Apr 24, 2026Updated last month
- A toolkit dedicated to speech evaluation.☆23Sep 26, 2024Updated last year
- A scalable solution that simplifies the integration of ComfyUI for developers☆11Jul 15, 2024Updated last year
- Silero VAD(ncnn): pre-trained enterprise-grade Voice Activity Detector.☆26Aug 21, 2024Updated last year
- [AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning".☆23Mar 18, 2026Updated 3 months ago
- ☆13May 23, 2024Updated 2 years ago
- Welcome to my project. OpenPyVision is a real time videoMixer based on opencv and pyqt6.☆14Aug 22, 2024Updated last year
- including compiler to encode DGL GNN model to instructions, runtime software to transfer data and control the accelerator, and hardware v…☆14Nov 19, 2023Updated 2 years ago
- Reproduction of the paper SFSRNet: Super-resolution for single-channel Audio Source Separation by me (@arda-num) and @dritx16. Navigate P…☆12Jul 7, 2022Updated 3 years ago
- ☆25Aug 29, 2025Updated 9 months ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- Simple PyTorch Denoisers for Waveform Audio☆41Apr 4, 2026Updated 2 months ago
- [TMLR 2024] Revisiting Random Weight Perturbation for Efficiently Improving Generalization☆12Oct 18, 2024Updated last year
- Octopus is a neural machine generation toolkit for Arabic Natural Language Generation (NLG)☆10Apr 29, 2024Updated 2 years ago
- Official repository for the paper "xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement" (Accepted to INTERSPEECH 2025)☆60Aug 28, 2025Updated 9 months ago
- steps to perform text-based speaker diarization with kaldi toolkit☆12Nov 2, 2018Updated 7 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- multi-scale time domain speaker extraction☆80Jun 7, 2021Updated 5 years ago
- The program ranked first in Audio-only track of DCASE2024 Challenge task3.☆22Mar 2, 2026Updated 3 months ago
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 4 years ago
- A strong baseline result on the CIFAR100 dataset☆20Nov 23, 2025Updated 6 months ago