nanless / universal-speech-enhancement
Applies score-based diffusion to enhance speech signals recorded under adverse conditions and distortions, including noise, reverberation, clipping, equalization (EQ) distortion, packet loss, codec loss, bandwidth limitations, and other forms of degradation.
☆23 · Updated last month
Related projects:
- Spherical residual vector quantization (SRVQ) ☆26 · Updated 3 weeks ago
- Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction ☆32 · Updated 5 months ago
- Real-time speech enhancement ☆11 · Updated 7 months ago
- ☆42 · Updated last year
- A robust pitch tracker using synchro-squeezed FFT and frequency-domain autocorrelation ☆34 · Updated 8 months ago
- Transformer with Local Modeling by Convolution for Speech Separation and Enhancement ☆26 · Updated last month
- PyTorch Models for Speech Enhancement ☆15 · Updated last year
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz ☆21 · Updated 8 months ago
- Efficient Personalized Speech Enhancement through Self-Supervised Learning ☆21 · Updated last year
- ☆57 · Updated last year
- Code for the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations" ☆13 · Updated last year
- ☆24 · Updated last week
- An evaluation set for large-scale trained TTS models (Coming in Sep 2024) ☆10 · Updated 2 weeks ago
- 60k hours of phoneme-aligned audio from audiobooks ☆18 · Updated last month
- Differentiable Mean Opinion Score Regularization for Perceptual Speech Enhancement ☆22 · Updated last year
- ☆9 · Updated 2 years ago
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding". ☆22 · Updated 10 months ago
- ConMamba for Automatic Speech Recognition ☆38 · Updated last month
- The source code for the paper CrossSinger (ASRU 2023) ☆18 · Updated 11 months ago
- Please visit https://thuhcsi.github.io/SnakeGAN/ ☆36 · Updated last year
- Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutiv… ☆39 · Updated 6 months ago
- ☆16 · Updated 2 months ago
- We design a spectral compression mapping (SCM) for full-band speech enhancement, and propose a two-stage stream named MHA-DPCRN ☆20 · Updated 2 years ago
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement" ☆12 · Updated 2 years ago
- The implementation of MDNet, which is in submission to Interspeech 2022 ☆12 · Updated 2 years ago
- ☆35 · Updated 4 months ago
- ☆13 · Updated 2 months ago
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆33 · Updated 3 months ago
- TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings ☆15 · Updated last month
- Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction ☆25 · Updated this week