JethroWangSir / SincQDR-VAD
☆24Aug 29, 2025Updated 5 months ago
Alternatives and similar repositories for SincQDR-VAD
Users who are interested in SincQDR-VAD are comparing it to the libraries listed below.
- Both audio-only and audio-visual speaker diarization datasets are listed here.☆14Feb 22, 2023Updated 2 years ago
- Voice activity detection and speaker gender segmentation audiovisual corpus☆16Jan 20, 2025Updated last year
- ☆17Oct 18, 2023Updated 2 years ago
- FNSE-SBGAN: Far-field Speech Enhancement with Schrödinger Bridge and Generative Adversarial Networks☆17May 12, 2025Updated 9 months ago
- Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconst…☆29Dec 25, 2025Updated last month
- TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings☆37Oct 27, 2025Updated 3 months ago
- ☆12Apr 18, 2025Updated 9 months ago
- ☆32Oct 23, 2025Updated 3 months ago
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- Official Repository for "Efficient Vocal Source Separation Through Windowed RoFormer"☆42Oct 30, 2025Updated 3 months ago
- Variations of L1 SNR Loss function for training audio source separation machine learning models☆44Feb 4, 2026Updated last week
- ☆44Sep 19, 2024Updated last year
- ☆50Jan 28, 2026Updated 2 weeks ago
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models.☆32Nov 9, 2025Updated 3 months ago
- Official code of SenSE.☆72Oct 30, 2025Updated 3 months ago
- RWKV-SpeechChat is a real-time dialogue script based on a frozen 3B RWKV model with trained adapters and initial states. Various trained …☆28Jan 1, 2025Updated last year
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models☆25Aug 11, 2024Updated last year
- A neural speech codec based on discrete WavLM representations☆24Aug 28, 2024Updated last year
- Official page of "DeFTAN-II: Efficient multichannel speech enhancement with subgroup processing", IEEE/ACM Transactions on Audio, Speech,…☆31Nov 21, 2024Updated last year
- Universal differential equations for ecologists☆14Feb 4, 2026Updated last week
- This repository is the official implementation of unimodal aggregation (UMA) for automatic speech recognition (ASR).☆36Dec 17, 2024Updated last year
- MSP-Podcast Challenge Baseline Code☆30Jun 12, 2024Updated last year
- Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…☆33Jun 14, 2024Updated last year
- SpeechJudge: Towards Human-Level Judgment for Speech Naturalness (https://arxiv.org/abs/2511.07931)☆56Dec 23, 2025Updated last month
- Balanced Error Rate for Speaker Diarization☆33Feb 28, 2023Updated 2 years ago
- ☆27Oct 25, 2024Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Oct 10, 2023Updated 2 years ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- My vocoder experiments☆31Jul 26, 2025Updated 6 months ago
- [ICASSP2025] Official code for VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis☆52Apr 9, 2025Updated 10 months ago
- Speech samples and code of BEdit-TTS☆34Oct 8, 2023Updated 2 years ago
- Code for the Interspeech 2024 paper "MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting"☆45Jan 24, 2026Updated 3 weeks ago
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems.☆105May 5, 2025Updated 9 months ago
- The program ranked first in Audio-only track of DCASE2024 Challenge task3.☆20Apr 12, 2025Updated 10 months ago
- A python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Mult…☆39Oct 11, 2024Updated last year
- WavReward: Spoken Dialogue Models With Generalist Reward Evaluators☆54May 15, 2025Updated 9 months ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha…☆36Aug 7, 2024Updated last year
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer.☆113Jan 28, 2026Updated 2 weeks ago
- ☆39Oct 19, 2025Updated 3 months ago