dr-pato / SSGD
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆14Updated 3 years ago
Alternatives and similar repositories for SSGD
Users that are interested in SSGD are comparing it to the libraries listed below
- Spherical residual vector quantization (SRVQ)☆31Updated last year
- ☆66Updated 2 years ago
- Official PyTorch inference code for the Interspeech 2025 paper: Efficient Speech Enhancement via Embeddings from Pre-trained Generative A…☆74Updated 7 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation…☆75Updated last year
- ☆54Updated 2 years ago
- A toolkit for researchers in multimodal sound separation.☆16Updated 2 years ago
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo…☆20Updated 2 years ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning☆23Updated 2 years ago
- Streaming Audiotransformers for online Audio tagging☆50Updated last year
- A toolkit dedicated to speech evaluation.☆24Updated last year
- ☆49Updated 9 months ago
- The implementation of MDNet, which is in submission to Interspeech 2022☆14Updated 3 years ago
- PyTorch implementation of LearnableUpsamplingLayer (NaturalSpeech, Tan et al., 2022)☆57Updated last year
- ☆16Updated last year
- Dynamic Mixing For Speech Processing (mix-on-the-fly)☆21Updated 3 years ago
- PyTorch implementation of Continuous Speech Separation☆12Updated 3 years ago
- A simple implementation for improving CosyVoice2 by GRPO method☆31Updated 3 months ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Updated last year
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Updated 3 years ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models☆55Updated 9 months ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆23Updated 2 months ago
- Real-Time ASR with CNN-BiLSTM: End-to-End Live Streaming Using PyTorch Lightning⚡☆11Updated last year
- ☆24Updated 2 years ago
- faster inference☆28Updated last year
- ☆52Updated last year
- The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"☆52Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Updated 2 years ago
- The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…☆42Updated 4 months ago
- A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation☆36Updated 2 years ago
- (WIP) Long-form speech generation☆31Updated 9 months ago