unixpickle / vq-voice-swap
Voice swapping with VQ-VAE and diffusion models
☆67 Updated 3 years ago
Alternatives and similar repositories for vq-voice-swap:
Users who are interested in vq-voice-swap are comparing it to the libraries listed below.
- ☆82 Updated last year
- A collection of pre-trained audio models, in PyTorch. ☆112 Updated 2 years ago
- Trainer for audio-diffusion-pytorch ☆128 Updated 2 years ago
- A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. ☆194 Updated last year
- ☆20 Updated 3 years ago
- Implementation of NWT, audio-to-video generation, in Pytorch ☆88 Updated 2 years ago
- ☆66 Updated last week
- Demo for 2022 ICASSP ☆64 Updated 2 years ago
- GOMIN; Gaudio Open Mel-spectrogram Inversion Network ☆110 Updated last year
- Majesty Diffusion by @Dango233 and @apolinario (@multimodalart) ☆25 Updated 2 years ago
- ☆22 Updated 2 years ago
- ☆42 Updated last month
- An implementation of simple diffusion in PyTorch (and JAX) ☆35 Updated 2 years ago
- CLOOB Conditioned Latent Diffusion training and inference code ☆112 Updated 2 years ago
- Upsampling Artifacts in Neural Audio Synthesis – https://arxiv.org/abs/2010.14356 ☆78 Updated 3 years ago
- alchemy with embeddings ☆34 Updated last year
- Contrastive Language-Audio Pretraining ☆15 Updated 3 years ago
- Implementation of the framework described in the paper Spectrogram Inpainting for Interactive Generation of Instrument Sounds published a… ☆38 Updated 2 years ago
- Easily turn large sets of audio urls to an audio dataset. ☆20 Updated 2 years ago
- High-Resolution Image Synthesis with Latent Diffusion Models ☆61 Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆82 Updated 3 months ago
- The demo page of UniAudio ☆34 Updated 11 months ago
- TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction. ☆88 Updated 3 years ago
- ☆56 Updated 2 years ago
- JAX implementation ViT-VQGAN ☆80 Updated 2 years ago
- ☆62 Updated 9 months ago
- Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP ☆39 Updated 2 years ago
- Code for Investigating Personalization Methods in Text to Music Generation ☆36 Updated 10 months ago
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ☆86 Updated last year
- Audio Demo for "FastSVC: Fast Cross-Domain Singing Voice Conversion with Feature-wise Linear Modulation" ☆19 Updated 3 years ago