Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆17Aug 1, 2024Updated last year
Alternatives and similar repositories for Tr-VAD
Users that are interested in Tr-VAD are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Pytorch implementation of "spectro-temporal attention-based voice activity detection"☆13Jun 4, 2024Updated last year
- A repository for code used to produce the results the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…☆21Nov 25, 2024Updated last year
- This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays.☆14Dec 3, 2021Updated 4 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆76Jul 29, 2024Updated last year
- Implementation of CGMM-MVDR beamforming used for Clarity challenge☆14Jan 14, 2022Updated 4 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- Code for ICASSP 2024 paper WhisperSeg: Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection☆41Jul 25, 2025Updated 8 months ago
- ASLP Summer Inter@NPU☆12Jul 30, 2024Updated last year
- Codebase of the submitted work in ICASSP 2023☆14Nov 30, 2022Updated 3 years ago
- A comprehensive framework to test audio comprehension of Large Audio Language Models.☆61Apr 3, 2026Updated 2 weeks ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Mar 12, 2023Updated 3 years ago
- This is the Python library for an unsupervised, fast method for robust voice activity detection (rVAD), as in the paper rVAD: An Unsuperv…☆152Jun 5, 2025Updated 10 months ago
- An unofficial implementation of the Personal VAD speaker-conditioned voice activity detection method. Bachelor's thesis project.☆81Sep 22, 2022Updated 3 years ago
- Universal differential equations for ecologists☆14Mar 24, 2026Updated 3 weeks ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- 3D Sound Source Localization using Masked Autoencoders☆19Feb 12, 2025Updated last year
- Landing Page for Divide and Remaster v3☆26Jul 29, 2025Updated 8 months ago
- Exploring Binary Classification Loss for Speaker Verification☆18Jul 18, 2023Updated 2 years ago
- ☆22Jul 10, 2025Updated 9 months ago
- This is the official implementation of PGUSE☆39Jun 7, 2025Updated 10 months ago
- A toolkit dedicate for speech evaluation.☆23Sep 26, 2024Updated last year
- A scalable solution that simplifies the integration of ComfyUI for developers☆11Jul 15, 2024Updated last year
- Silero VAD(ncnn): pre-trained enterprise-grade Voice Activity Detector.☆25Aug 21, 2024Updated last year
- [AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning".☆22Mar 18, 2026Updated last month
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- ☆13May 23, 2024Updated last year
- Welcome to my project. OpenPyVision is a real time videoMixer based on opencv and pyqt6.☆14Aug 22, 2024Updated last year
- Reproduction of the paper SFSRNet: Super-resolution for single-channel Audio Source Separation by me (@arda-num) and @dritx16. Navigate P…☆11Jul 7, 2022Updated 3 years ago
- ☆25Aug 29, 2025Updated 7 months ago
- state-of-the-art models for diacritics restoration for Arabic language☆17Feb 23, 2025Updated last year
- Official repository for the paper "xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement" (Accepted to INTERSPEECH 2025)☆58Aug 28, 2025Updated 7 months ago
- Simple PyTorch Denoisers for Waveform Audio☆41Apr 4, 2026Updated 2 weeks ago
- Octopus is a neural machine generation toolkit for Arabic Natural Lnagauge Generation (NLG)☆10Apr 29, 2024Updated last year
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.☆19Feb 6, 2025Updated last year
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- The program ranked first in Audio-only track of DCASE2024 Challenge task3.☆22Mar 2, 2026Updated last month
- multi-scale time domain speaker extraction☆75Jun 7, 2021Updated 4 years ago
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 3 years ago
- ☆82Mar 3, 2026Updated last month
- Fast Punctuation Restoration using Transformer Models for Vietnamese☆11Jun 10, 2022Updated 3 years ago
- Permutation invariant training in PyTorch☆13Oct 2, 2020Updated 5 years ago
- IPA Phonetic dataset lexicon☆18Mar 20, 2026Updated 3 weeks ago