kuleshov-group / mdlm
Simplified Masked Diffusion Language Model
☆251 Updated last month
Alternatives and similar repositories for mdlm:
Users who are interested in mdlm are comparing it to the repositories listed below:
- [ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834) ☆454 Updated 10 months ago
- A curated list for awesome discrete diffusion models resources. ☆198 Updated this week
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- Minimal Implementation of a D3PM in pytorch ☆192 Updated 8 months ago
- ☆168 Updated last year
- A simple implementation of Bayesian Flow Networks (BFN) ☆240 Updated last year
- Educational implementation of the Discrete Flow Matching paper ☆71 Updated 4 months ago
- ☆60 Updated 3 weeks ago
- Understand and test language model architectures on synthetic tasks. ☆175 Updated this week
- ☆78 Updated last year
- Some preliminary explorations of Mamba's context scaling. ☆206 Updated 11 months ago
- Code for https://arxiv.org/abs/2406.04329 ☆51 Updated last month
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI ☆270 Updated 2 months ago
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆71 Updated last month
- Normalized Transformer (nGPT) ☆145 Updated last month
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆145 Updated 2 weeks ago
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆210 Updated last week
- Implementation of rectified flow and some of its followup research / improvements in Pytorch ☆231 Updated this week
- Implementation of a multimodal diffusion transformer in Pytorch ☆99 Updated 6 months ago
- ☆116 Updated 10 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆219 Updated last month
- [ICML 2023] Reflected Diffusion Models (https://arxiv.org/abs/2304.04740) ☆157 Updated last year
- Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" ☆35 Updated 2 months ago
- Annotated Flow Matching paper ☆157 Updated 4 months ago
- Reparameterized Discrete Diffusion Models for Text Generation ☆92 Updated last year
- ☆180 Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆90 Updated last month
- This is the official code release for Bayesian Flow Networks.