Yifei-ZHAO96 / Tr-VAD
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆12Updated 10 months ago
Alternatives and similar repositories for Tr-VAD
Users who are interested in Tr-VAD are comparing it to the repositories listed below.
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 3 weeks ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆63Updated 10 months ago
- Speech enhancement in noisy and reverberant environments using deep neural networks☆20Updated 2 months ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆38Updated this week
- This repository provides an implementation of the DPCCN model for single-channel speech separation. More details will be updated soon.☆13Updated 3 years ago
- ☆26Updated 4 months ago
- Official implementation for FlowSep☆50Updated 5 months ago
- ☆24Updated last month
- We implemented the DEMUCS model for speech enhancement in the time-frequency domain, and additionally implemented HD-DEMUCS.☆29Updated last year
- ☆26Updated 7 months ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆19Updated last year
- GPT for FACodec☆13Updated last year
- An implementation for Frame-level Speech Signal-to-Noise Ratio Estimation using deep learning☆38Updated 3 years ago
- Streaming Audio Transformers for online audio tagging☆44Updated 11 months ago
- A collection of all our phonemizers for dataset construction and inference☆23Updated 3 months ago
- ☆19Updated last year
- This is the official implementation of a reverberant speech to room impulse response estimator☆31Updated 10 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆90Updated 5 months ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions"☆25Updated 3 weeks ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆36Updated 3 weeks ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"☆14Updated 2 years ago
- Cantonese Grapheme-to-Phoneme Converter based on GitYCC/g2pW☆13Updated 5 months ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch☆18Updated last week
- Unofficial implementation of the WaveNeXt vocoder☆46Updated 9 months ago
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture.☆49Updated 2 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆62Updated last week
- Test Framework for few-shot open set KWS☆31Updated 6 months ago
- Fully Quantized Neural Networks For Speech Enhancement☆61Updated last year
- A toolkit for researchers in multimodal sound separation.☆16Updated last year
- Official code for Dense-TSNet☆12Updated 8 months ago