Yifei-ZHAO96 / Tr-VAD
Tr-VAD: An Efficient Transformer based Voice Activity Detection Model
☆11 Updated 9 months ago
Alternatives and similar repositories for Tr-VAD
Users that are interested in Tr-VAD are comparing it to the libraries listed below
- Speech enhancement in noisy and reverberant environments using deep neural networks ☆20 Updated last month
- ☆26 Updated 6 months ago
- ☆24 Updated last week
- ☆31 Updated last month
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆38 Updated last month
- Official implementation for FlowSep ☆46 Updated 4 months ago
- GPT for FACodec ☆13 Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆27 Updated 9 months ago
- Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations" ☆14 Updated 2 years ago
- Unofficial implementation of wavenext vocoder ☆45 Updated 8 months ago
- Simple and lightweight Zero-shot Text-to-Speech (TTS) synthesis model ☆23 Updated 2 weeks ago
- Supervoice diffusion enhance ☆26 Updated 10 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆63 Updated 9 months ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆68 Updated 6 months ago
- ☆19 Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆49 Updated 2 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆17 Updated 5 months ago
- ☆50 Updated last month
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆34 Updated 11 months ago
- My vocoder experiments ☆28 Updated 7 months ago
- ☆41 Updated 6 months ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆14 Updated 2 months ago
- High quality text-to-speech based on StyleTTS 2. ☆42 Updated this week
- Official repository for Mamba-based Segmentation Model for Speaker Diarization ☆36 Updated this week
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆87 Updated 4 months ago
- ☆26 Updated 3 months ago
- Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals ☆17 Updated 9 months ago
- E2E TTS using Conditional Flow Matching (Experimental*) ☆69 Updated last year
- Test code disclosure for the research paper "UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model", as a supplementa… ☆20 Updated last year
- C++ version of pyannote audio overlapped speech detection pipeline ☆13 Updated last year