alanjeffares / discreteVAE
Code for our tutorial on Discrete Variational Autoencoders
☆32 Updated 5 months ago
Alternatives and similar repositories for discreteVAE
Users who are interested in discreteVAE are comparing it to the repositories listed below:
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆61 Updated last year
- Implementation of the dynamic chunking mechanism in H-net by Hwang et al. of Carnegie Mellon ☆65 Updated 3 months ago
- ☆151 Updated 7 months ago
- ☆142 Updated last year
- ☆42 Updated last year
- Discrete Flow Matching implemented in PyTorch ☆30 Updated 7 months ago
- ☆31 Updated 2 months ago
- Official Jax Implementation of MD4 Masked Diffusion Models ☆139 Updated 8 months ago
- Implementation of a multimodal diffusion transformer in Pytorch ☆106 Updated last year
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆134 Updated last month
- Annotated Flow Matching paper ☆218 Updated last year
- Pytorch implementation of stochastically quantized variational autoencoder (SQ-VAE) ☆192 Updated 3 years ago
- Educational implementation of the Discrete Flow Matching paper ☆125 Updated last year
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆57 Updated last year
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆114 Updated 2 months ago
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆132 Updated last week
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆91 Updated 4 months ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆88 Updated last year
- ☆108 Updated 2 years ago
- A Pytorch Implementation of Finite Scalar Quantization ☆164 Updated last year
- Code for the paper: "Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods" ☆27 Updated 5 months ago
- Experimental playground for benchmarking language model (LM) architectures, layers, and tricks on smaller datasets. Designed for flexible… ☆86 Updated this week
- Flash Attention Triton kernel with support for second-order derivatives ☆111 Updated 3 weeks ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆165 Updated 9 months ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction ☆74 Updated 5 months ago
- RND1: Scaling Diffusion Language Models ☆163 Updated 3 weeks ago
- ☆296 Updated 10 months ago
- Implementation of Agent Attention in Pytorch ☆91 Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆68 Updated last year
- Speech2Vec Reality Check ☆84 Updated 2 years ago