madhavlab / 2022_syncnet
SyncNet for Time Synchronization
☆23 Updated 2 years ago
Alternatives and similar repositories for 2022_syncnet:
Users who are interested in 2022_syncnet are comparing it to the libraries listed below
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in PyTorch. ☆28 Updated 3 years ago
- Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models ☆42 Updated this week
- ☆65 Updated last year
- Ultrafast GAN based Vocoder for Text to Speech ☆50 Updated 2 years ago
- Python implementation of the paper "Dynamic Temporal Alignment of Speech to Lips" ☆32 Updated 5 years ago
- The official PyTorch implementation of "Inter-SubNet: Speech Enhancement with Subband Interaction", accepted by ICASSP 2023. ☆95 Updated last year
- [Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement ☆37 Updated 4 months ago
- The implementation of the paper "SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody" ☆32 Updated last year
- Glow-TTS with Stochastic Duration Predictor and Stochastic Pitch Predictor ☆18 Updated last year
- ☆13 Updated last year
- [TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing ☆21 Updated 7 months ago
- Repo for source code of EBEN: Extreme Bandwidth Extension Network ☆73 Updated 2 months ago
- ☆10 Updated 2 years ago
- Towards Intelligibility-Oriented Audio-Visual Speech Enhancement ☆14 Updated 7 months ago
- This repo contains conv-tasnet for basis-melgan. If you want the code for basis-melgan, please refer to FastVocoder. ☆20 Updated 3 years ago
- The official PyTorch implementation of the paper: An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmen… ☆9 Updated 3 years ago
- (TASLP 2022) Unsupervised speech enhancement using DVAEs ☆21 Updated 3 months ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆60 Updated 8 months ago
- Computes the Mel-Cepstral Distance of two WAV files based on the paper "Mel-Cepstral Distance Measure for Objective Speech Quality Assess… ☆52 Updated 4 months ago
- PyTorch implementation of LiMuSE ☆30 Updated 2 years ago
- ☆20 Updated 5 months ago
- This is the implementation of our Interspeech 2022 paper "Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conv… ☆18 Updated last year
- Learning and controlling the source-filter representation of speech with a variational autoencoder ☆45 Updated last year
- Official repository for the paper Multimodal Transformer Distillation for Audio-Visual Synchronization (ICASSP 2024). ☆25 Updated last year
- STOI loss function in PyTorch ☆91 Updated 6 months ago
- An implementation for Frame-level Speech Signal-to-Noise Ratio Estimation using deep learning ☆38 Updated 3 years ago
- [ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion ☆61 Updated 2 months ago
- VAE modified from Descript Audio Codec, which replaces the RVQ with VAE ☆69 Updated last year
- Speech enhancement in noisy and reverberant environments using deep neural networks ☆19 Updated 2 weeks ago
- Streaming Audiotransformers for online Audio tagging ☆43 Updated 10 months ago