wilsonchingg / logmmse
LogMMSE speech enhancement/noise reduction
☆30 · Updated 4 years ago
Alternatives and similar repositories for logmmse:
Users who are interested in logmmse are comparing it to the libraries listed below.
- Implementation of "FastSpeech: Fast, Robust and Controllable Text to Speech" ☆64 · Updated last year
- A tutorial implementation of voice conversion using PyTorch ☆35 · Updated 6 years ago
- Sequence-to-sequence TTS based on Kyubyong's dc_tts ☆60 · Updated last year
- Contains code for our work on speech to singing conversion (ICASSP 2020) ☆50 · Updated 4 years ago
- Interspeech 2019 tutorial materials ☆48 · Updated 5 years ago
- Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020) ☆79 · Updated 3 years ago
- Integration of FastSpeech text-to-mel generation and the fast SqueezeWave vocoder ☆20 · Updated last year
- A PyTorch implementation of MelGAN ☆67 · Updated 5 years ago
- ☆40 · Updated 2 years ago
- PyTorch implementation of FFTNet ☆86 · Updated 6 years ago
- Official PyTorch implementation of Speaker Conditional WaveRNN ☆109 · Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆100 · Updated last year
- Data processing tools for preparing speech and labels for training TTS voices ☆24 · Updated 4 years ago
- A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone … ☆41 · Updated 2 years ago
- ICASSP 2020 ESPnet-TTS: Merlin baseline system ☆36 · Updated 5 years ago
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021 ☆57 · Updated 2 years ago
- Feature extractor for DL speech processing. ☆65 · Updated 2 years ago
- Code for Speaker Change Detection in Broadcast TV using Bidirectional Long Short-Term Memory Networks ☆63 · Updated 4 years ago
- A fast CNN-based vocoder ☆78 · Updated 4 years ago
- Robust Speech Activity Detection (SAD) in movie audio ☆26 · Updated 3 years ago
- ☆10 · Updated 5 years ago
- This is the implementation of our Interspeech 2020 paper "Converting anyone's emotion: towards speaker-independent emotional voice conver… ☆89 · Updated 4 years ago
- A lightweight library to compute Diarization Error Rate (DER). ☆59 · Updated last year
- Adapt Kaldi-ASR nnet3 chain models from Zamia-Speech.org to a different language model ☆34 · Updated 4 years ago
- Util code, issues, discussions ☆28 · Updated 6 years ago
- A better, faster, stronger version of the unbounded interleaved-state recurrent neural network (UIS-RNN) ☆59 · Updated 4 years ago
- RawNet: Fast End-to-End Neural Vocoder ☆42 · Updated 5 years ago
- ERISHA is a multilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for… ☆43 · Updated 4 years ago
- Compute useful transcription metrics (CER, WER, SER, ...) ☆26 · Updated 10 years ago