praeclarumjj3 / VQ-VAE-on-MNIST
VQ-VAE implementation in PyTorch
☆24 Updated 4 years ago
Alternatives and similar repositories for VQ-VAE-on-MNIST:
Users who are interested in VQ-VAE-on-MNIST are comparing it to the libraries listed below.
- A minimal PyTorch Implementation of Stochastically Quantized Variational AutoEncoder (SQ-VAE) by Sony ☆31 Updated last year
- Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders" ☆11 Updated 7 months ago
- Inspired by "Neural Networks Fail to Learn Periodic Functions and How to Fix It" ☆66 Updated 11 months ago
- A collection of audio autoencoders, in PyTorch. ☆40 Updated 2 years ago
- ☆76 Updated last week
- A home for audio ML in JAX. Has common features, learnable frontends, pretrained supervised and self-supervised models. ☆68 Updated 2 years ago
- A PyTorch implementation of Bayesian flow networks (Graves et al., 2023). ☆25 Updated last year
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆45 Updated 6 months ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ☆57 Updated last year
- ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation ☆34 Updated 5 months ago
- Small audio language model for reasoning ☆61 Updated 3 weeks ago
- Implementation of "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" in PyTorch ☆18 Updated 2 weeks ago
- A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch. ☆85 Updated last year
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation … ☆64 Updated 4 months ago
- Official repo for DiscoDiff: Coarse-to-Fine Text-to-Music Latent Diffusion presented at ICASSP 2025 ☆12 Updated last month
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv… ☆36 Updated last year
- JAX Implementations of Descript Audio Codec and EnCodec ☆26 Updated last month
- PyTorch Implementation of [AudioLCM]: an efficient and high-quality text-to-audio generation with latent consistency model. ☆11 Updated 10 months ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 Updated 10 months ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆86 Updated 6 months ago
- Learning and controlling the source-filter representation of speech with a variational autoencoder ☆45 Updated 2 years ago
- Viterbi decoding in PyTorch ☆32 Updated last month
- (ICASSP 2025) Learning Source Disentanglement in Neural Audio Codec ☆32 Updated last week
- The official code for the SALMon🍣 benchmark (ICASSP 2025 - Oral) ☆45 Updated 3 weeks ago
- Implementation of a Light Recurrent Unit in PyTorch ☆46 Updated 7 months ago
- ☆24 Updated 9 months ago
- ☆83 Updated last year
- Accompanying code for our paper "Optimizing Short-Time Fourier Transform Parameters via Gradient Descent" ☆33 Updated 4 years ago
- Official implementation for FlowSep ☆45 Updated 4 months ago
- An official PyTorch implementation of EACL2024 short paper "Flow Matching for Conditional Text Generation in a Few Sampling Steps" ☆16 Updated 11 months ago