unixpickle / vq-voice-swap
Voice swapping with VQ-VAE and diffusion models
☆66 Updated 3 years ago
Alternatives and similar repositories for vq-voice-swap:
Users who are interested in vq-voice-swap are comparing it to the repositories listed below
- ☆84 Updated last year
- Demo for 2022 ICASSP ☆64 Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆86 Updated 6 months ago
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ☆87 Updated 2 years ago
- Trainer for audio-diffusion-pytorch ☆129 Updated 2 years ago
- A collection of pre-trained audio models, in PyTorch. ☆113 Updated 2 years ago
- The demo page of UniAudio ☆33 Updated last year
- A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. ☆196 Updated last year
- Code for Unconditional Audio Generation with GAN and Cycle Regularization ☆75 Updated 3 years ago
- Contrastive Language-Audio Pretraining ☆15 Updated 3 years ago
- ☆22 Updated 2 years ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆90 Updated 3 years ago
- ☆40 Updated 5 months ago
- Upsampling Artifacts in Neural Audio Synthesis – https://arxiv.org/abs/2010.14356 ☆78 Updated 4 years ago
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction. ☆88 Updated 3 years ago
- ☆20 Updated 3 years ago
- ☆66 Updated 3 weeks ago
- ☆65 Updated last year
- GOMIN; Gaudio Open Mel-spectrogram Inversion Network ☆110 Updated last year
- A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch. ☆85 Updated last year
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆113 Updated 2 years ago
- A high-resolution implementation of HiFi-GAN Vocoder for Voice Conversion. ☆31 Updated 2 years ago
- ☆23 Updated last year
- CLOOB Conditioned Latent Diffusion training and inference code ☆113 Updated 3 years ago
- Unofficial implementation of Neural Analysis and Synthesis ☆7 Updated 3 years ago
- ☆28 Updated 3 years ago
- Implementation of the framework described in the paper Spectrogram Inpainting for Interactive Generation of Instrument Sounds published a… ☆40 Updated 2 years ago
- 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch ☆50 Updated 2 years ago
- text-to-audio-latent-diffusion ☆37 Updated last year
- PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis ☆69 Updated 3 years ago