eastonYi / wav2vec
A simplified version of wav2vec (1.0, vq, 2.0) in fairseq
☆132 Updated 4 years ago
Related projects
Alternatives and complementary repositories for wav2vec
- The official implementation of VAENAR-TTS, a VAE based non-autoregressive TTS model. ☆144 Updated 3 years ago
- ☆137 Updated last year
- This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data… ☆128 Updated 3 years ago
- Implementation of the paper "Spoken Language Recognition using X-vectors" in Pytorch ☆105 Updated 4 years ago
- A summary of speech data augmentation algorithms ☆64 Updated 3 years ago
- The official repository for Audio ALBERT ☆64 Updated 2 years ago
- Dataset and baseline code for the VocalSound dataset (ICASSP2022). ☆123 Updated 2 years ago
- ☆43 Updated last year
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆201 Updated 10 months ago
- PyTorch implementation of "Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss" (ICASS… ☆101 Updated 2 years ago
- This repo lists the reference papers of 《Speaker Recognition Based on Deep Learning: An Overview》 ☆37 Updated 3 years ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆112 Updated 9 months ago
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition ☆211 Updated last year
- Research code for the paper "Fine-tuning wav2vec2 for speaker recognition" found at https://arxiv.org/abs/2109.15053 ☆143 Updated 2 years ago
- Official PyTorch implementation of Speaker Conditional WaveRNN ☆109 Updated 2 years ago
- Fre-GAN: Adversarial Frequency-consistent Audio Synthesis ☆101 Updated 3 years ago
- ☆109 Updated 2 years ago
- This is the GitHub page for publicly available emotional speech data. ☆322 Updated 2 years ago
- ☆110 Updated 2 years ago
- PyTorch implementation of Densely Connected Time Delay Neural Network ☆85 Updated last year
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec… ☆97 Updated last year
- Implementation of "Duration Informed Attention Network for Multimodal Synthesis" paper in PyTorch. ☆183 Updated 4 years ago
- [INTERSPEECH'2022] Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning ☆79 Updated 2 years ago
- Official implementation of INTERSPEECH 2021 paper 'Emotion Recognition from Speech Using Wav2vec 2.0 Embeddings' ☆126 Updated 2 years ago
- ☆139 Updated 4 months ago
- iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform ☆227 Updated last year
- Layer-wise analysis of self-supervised pre-trained speech representations ☆97 Updated last month
- This is the implementation of the paper "Emotion Intensity and its Control for Emotional Voice Conversion". ☆81 Updated 2 years ago
- 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition ☆118 Updated 2 years ago
- This repository contains the code for our upcoming paper An Investigation of End-to-End Models for Robust Speech Recognition at ICASSP 20… ☆46 Updated 3 years ago