sony / sqvae
PyTorch implementation of stochastically quantized variational autoencoder (SQ-VAE)
☆181 Updated 2 years ago
Related projects
Alternatives and complementary repositories for sqvae
- ☆118 Updated 8 months ago
- [NeurIPS 2021] Diffusion Normalizing Flow (DiffFlow) ☆117 Updated last year
- [ICCV 2023] Online Clustered Codebook ☆148 Updated 2 months ago
- Contrastively Disentangled Sequential Variational Autoencoder ☆45 Updated last month
- A PyTorch Implementation of Finite Scalar Quantization ☆88 Updated 11 months ago
- PyTorch implementation of slicing adversarial network (SAN) ☆90 Updated 5 months ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆82 Updated last month
- ☆299 Updated 2 years ago
- ☆33 Updated 10 months ago
- BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis ☆223 Updated 2 years ago
- Official PyTorch implementation of the paper: Flow Matching in Latent Space ☆209 Updated 3 weeks ago
- PyTorch implementation of diffusion models. ☆58 Updated 3 years ago
- Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency" ☆152 Updated 4 months ago
- [NeurIPS 2023] Official Implementation: "Consistent Diffusion Models" ☆54 Updated last year
- ☆29 Updated last year
- This repo contains the official PyTorch implementation of AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image … ☆77 Updated 5 months ago
- Vector-Quantized Contrastive Predictive Coding for Acoustic Unit Discovery and Voice Conversion ☆142 Updated 4 years ago
- PyTorch implementation of D2C: Diffusion-Decoding Models for Few-shot Conditional Generation. ☆121 Updated 2 years ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆57 Updated 3 months ago
- Official code for "Maximum Likelihood Training for Score-Based Diffusion ODEs by High-Order Denoising Score Matching" (ICML 2022) ☆53 Updated 2 years ago
- Official code for "Maximum Likelihood Training of Score-Based Diffusion Models", NeurIPS 2021 (spotlight) ☆134 Updated 2 years ago
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations ☆89 Updated 5 months ago
- A PyTorch implementation of "Continuous Relaxation Training of Discrete Latent Variable Image Models" ☆72 Updated 4 years ago
- A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application ☆221 Updated 7 months ago
- Code for the paper Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models (ICLR 2022 Outsta… ☆169 Updated 2 years ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer ☆83 Updated 2 years ago
- Implementation of Bit Diffusion, Hinton's group's attempt at discrete denoising diffusion, in PyTorch ☆333 Updated last year
- ☆239 Updated last month
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆110 Updated last year
- This package aims at simplifying the download of the AudioCaps dataset. ☆30 Updated 11 months ago