eastonYi / wav2vec
A simplified version of wav2vec (1.0, vq, 2.0) in fairseq
☆127Updated 4 years ago
Related projects:
- This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data…☆129Updated 2 years ago
- Implementation of "Duration Informed Attention Network for Multimodal Synthesis" paper in PyTorch.☆182Updated 4 years ago
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)☆135Updated 2 years ago
- iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform☆221Updated last year
- The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model.☆144Updated 3 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings'☆122Updated 2 years ago
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech.☆196Updated 8 months ago
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition☆208Updated last year
- Official implementation of Meta-StyleSpeech and StyleSpeech☆238Updated 2 years ago
- Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053☆140Updated 2 years ago
- ☆130Updated 2 months ago
- PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS…☆101Updated 2 years ago
- PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T…☆181Updated last year
- This is the GitHub page for publicly available emotional speech data.☆314Updated 2 years ago
- This repo lists the reference papers of 《Speaker Recognition Based on Deep Learning: An Overview》☆35Updated 3 years ago
- Implementation of "Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis"☆166Updated last year
- ☆93Updated 2 years ago
- Wav2Keyword is a keyword spotting (KWS) model based on Wav2Vec 2.0. It achieves state-of-the-art performance on the Speech Commands dataset V1 and V2.☆98Updated last year
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…☆375Updated last year
- Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS☆159Updated 5 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆97Updated last year
- ☆160Updated 2 years ago
- Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning☆184Updated 4 years ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis☆101Updated 3 years ago
- A PyTorch implementation of End-to-End Neural Diarization☆98Updated last year
- Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllabl…☆157Updated 2 years ago
- Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.☆86Updated 2 years ago
- A Survey on Neural Speech Synthesis https://arxiv.org/pdf/2106.15561.pdf☆359Updated 2 years ago
- Kaldi-compatible online & offline feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd - P…☆186Updated last week
- ☆100Updated last year