unixpickle / vq-voice-swap
Voice swapping with VQ-VAE and diffusion models
☆66 Updated 3 years ago
Related projects
Alternatives and complementary repositories for vq-voice-swap
- ☆81 Updated last year
- Implementation of NWT, audio-to-video generation, in Pytorch ☆87 Updated 2 years ago
- Trainer for audio-diffusion-pytorch ☆127 Updated last year
- ☆22 Updated 2 years ago
- ☆21 Updated 3 years ago
- A novel diffusion-based model for synthesizing long-context, high-fidelity music efficiently. ☆194 Updated last year
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ☆86 Updated last year
- ☆64 Updated last month
- Unofficial implementation of Neural Analysis and Synthesis ☆7 Updated 2 years ago
- 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch ☆50 Updated last year
- A collection of pre-trained audio models, in PyTorch. ☆111 Updated last year
- ☆31 Updated 2 years ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆82 Updated last month
- Easily turn large sets of audio urls to an audio dataset. ☆20 Updated last year
- Majesty Diffusion by @Dango233 and @apolinario (@multimodalart) ☆25 Updated 2 years ago
- CLOOB Conditioned Latent Diffusion training and inference code ☆112 Updated 2 years ago
- ☆28 Updated 2 years ago
- Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP ☆39 Updated last year
- Upsampling Artifacts in Neural Audio Synthesis – https://arxiv.org/abs/2010.14356 ☆76 Updated 3 years ago
- Contrastive Language-Audio Pretraining ☆15 Updated 3 years ago
- Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch ☆89 Updated 2 years ago
- An implementation of simple diffusion in PyTorch (and JAX) ☆35 Updated last year
- High-Resolution Image Synthesis with Latent Diffusion Models ☆62 Updated 2 years ago
- text-to-audio-latent-diffusion ☆35 Updated last year
- Implementation of the framework described in the paper Spectrogram Inpainting for Interactive Generation of Instrument Sounds published a… ☆37 Updated 2 years ago
- Training simple models to predict CLIP image embeddings from text embeddings, and vice versa. ☆59 Updated 2 years ago
- ☆14 Updated 2 years ago
- ☆56 Updated 2 years ago
- ☆64 Updated 3 years ago