dwgnr / speech-conversion
Whisper to Normal Speech Conversion with SC-MelGAN and SC-VQ-VAE
☆14 · Updated last year
Related projects
Alternatives and complementary repositories for speech-conversion
- SandyPanda-MLDL / -Evaluation-Metrics-Used-For-The-Performance-Evaluation-of-Voice-Conversion-VC-Models: Evaluation Metrics Used For The Performance Evaluation of Voice Conversion (VC) Models ☆12 · Updated last year
- Code for Interspeech2022 paper DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion ☆13 · Updated last year
- Towards Intelligibility-Oriented Audio-Visual Speech Enhancement ☆13 · Updated 2 months ago
- Voice emotion conversion model for DS/ML master's thesis. F0 contour mapping in sequence-to-sequence RNN-LSTM architecture in TensorFlow. ☆26 · Updated 6 years ago
- Nonparallel Emotional Speech Conversion with MUNIT. Introduction: This is a TensorFlow implementation of paper (https://arxiv.org/pdf/1811… ☆14 · Updated 3 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》 Using speech enhancement and end-to-end denoising training to improve degrada… ☆23 · Updated 2 years ago
- I-Vector Speaker recognition system implemented with MSRIT in MATLAB ☆15 · Updated 8 years ago
- Implementation for paper: Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement ☆20 · Updated 3 years ago
- Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P… ☆34 · Updated 11 months ago
- Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge in Online Conferencing Applications ☆43 · Updated 2 years ago
- PyTorch implementation of "f0-consistent many-to-many non-parallel voice conversion via conditional autoencoder" ☆28 · Updated 4 years ago
- The implementation of "X-TF-GridNet: A Time-Frequency Domain Target Speaker Extraction Network with Adaptive Speaker Embedding Fusion", w… ☆36 · Updated last month
- Unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT" ☆15 · Updated last year
- The implementation of TaylorBeamformer, which is in submission to Interspeech2022 ☆40 · Updated 2 years ago
- Voice Alignment and Conversion with Neural Networks and the WORLD codec. ☆20 · Updated 5 years ago
- The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement" ☆51 · Updated last year
- This code is to run the WARP-Q speech quality metric. ☆34 · Updated last month
- A Python implementation of “Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization” [TASLP 2021] ☆23 · Updated last year
- The official repo: "McNet: Fuse Multiple Cues for Multichannel Speech Enhancement", ICASSP 2023 ☆108 · Updated last year
- This repository contains the audio samples and the source code that accompany the paper: "MixCycle: Unsupervised Speech Separation via Cy… ☆23 · Updated last year
- Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction ☆49 · Updated 2 weeks ago
- wsj0-{2, 3, 4, 5} mix generation scripts, in Python. ☆52 · Updated 3 years ago
- Multi-channel target speech extraction with channel decorrelation and target speaker adaptation ☆25 · Updated 3 years ago
- A probabilistic scoring backend for length-normalized embeddings. ☆10 · Updated 6 months ago