kuleshov-group / remdm
Remasking Discrete Diffusion Models with Inference-Time Scaling
☆18 · Updated last month
Alternatives and similar repositories for remdm:
Users who are interested in remdm are comparing it to the libraries listed below.
- Official Code Repository for the paper "Continuous Diffusion Model for Language Modeling". ☆25 · Updated last month
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate" ☆95 · Updated 2 weeks ago
- Code for the paper: "Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods" ☆19 · Updated last month
- ☆45 · Updated 11 months ago
- Official Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" [ICLR 2025] ☆56 · Updated last month
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆48 · Updated last month
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024) ☆49 · Updated 5 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR) ☆30 · Updated last month
- A big_vision inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆34 · Updated 10 months ago
- Official Jax Implementation of MD4 Masked Diffusion Models ☆77 · Updated last month
- Official Implementation of the paper: A Complete Recipe for Diffusion Generative Models ☆30 · Updated 5 months ago
- Unofficial Implementation of Selective Attention Transformer ☆16 · Updated 5 months ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch ☆24 · Updated 3 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆65 · Updated 7 months ago
- Official code for the paper "Image generation with shortest path diffusion" accepted at ICML 2023. ☆23 · Updated last year
- ☆17 · Updated 3 months ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆55 · Updated 11 months ago
- The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024 ☆108 · Updated 6 months ago
- The official repo of continuous speculative decoding ☆24 · Updated 3 weeks ago
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards. ☆127 · Updated 2 months ago
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆40 · Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆54 · Updated 8 months ago
- ☆95 · Updated last year
- ☆32 · Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆28 · Updated 10 months ago
- Official Implementation of Nabla-GFlowNet (ICLR 2025) ☆19 · Updated 2 weeks ago
- JAX Scalify: end-to-end scaled arithmetics ☆16 · Updated 5 months ago
- Code for the paper "Function-Space Learning Rates" ☆19 · Updated last week
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆19 · Updated 4 months ago
- Simple Guidance Mechanisms for Discrete Diffusion Models ☆31 · Updated 4 months ago