scxue/AO-GPT-MDM

Any-Order GPT as Masked Diffusion Model: Decoupling Formulation and Architecture. Training an MDM using GPT with this repo!

☆36

Alternatives and similar repositories for AO-GPT-MDM

Users who are interested in AO-GPT-MDM are comparing it to the repositories listed below.

ML-GSAI / SMDM
Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"
☆385 · Dec 22, 2024 · Updated last year
kuleshov-group / d2
d2: Improved Techniques for Training Reasoning Diffusion Language Models
☆16 · Mar 25, 2026 · Updated 4 months ago
tyshiwo1 / Awesome-Visual-Tokenizer
Awesome Visual Tokenizers/Autoencoders
☆20 · Nov 19, 2025 · Updated 8 months ago
ML-GSAI / RADD
Official PyTorch implementation for "Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data" (ICLR…
☆84 · May 30, 2025 · Updated last year
kuleshov-group / e2d2
[NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
☆47 · Oct 29, 2025 · Updated 9 months ago
HKUNLP / DiffuLLaMA
[ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
☆401 · May 31, 2025 · Updated last year
ysy-phoenix / evalhub
All-in-one benchmarking platform for evaluating LLMs.
☆15 · Nov 12, 2025 · Updated 8 months ago
giannisdaras / ambient-omni
[NeurIPS 2025, Spotlight] Ambient-o: Training Good Models with Bad Data.
☆35 · Jul 2, 2026 · Updated 3 weeks ago
guolinke / SphereAR
Implementation of "Hyperspherical Latents Improve Continuous-Token Autoregressive Generation"
☆104 · Feb 28, 2026 · Updated 5 months ago
SJTU-DENG-Lab / Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
☆261 · Feb 3, 2026 · Updated 5 months ago
BIGKnight / Understanding-Training-free-Diffusion-Guidance
☆19 · Mar 18, 2024 · Updated 2 years ago
JetAstra / SDAR
SDAR (Synergy of Diffusion and AutoRegression), a large diffusion language model (1.7B, 4B, 8B, 30B)
☆364 · Jun 2, 2026 · Updated last month
gabeguo / any-order-speculative-decoding
Reviving Any-Order Autoregressive Models via Principled Parallel Sampling and Speculative Decoding
☆16 · Nov 16, 2025 · Updated 8 months ago
ML-GSAI / BFN-Solver
Official PyTorch implementation for "Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations"
☆41 · Apr 23, 2024 · Updated 2 years ago
hxixixh / gumbel-distill
Official implementation of Gumbel Distillation for Parallel Text Generation
☆21 · Mar 24, 2026 · Updated 4 months ago
tyuxie / RFM
The official codebase for Reflected Flow Matching (ICML 2024)
☆24 · Jun 19, 2024 · Updated 2 years ago
mingyuanzhou / SiD-LSG
Score identity Distillation with Long and Short Guidance for One-Step Text-to-Image Generation
☆96 · Dec 4, 2025 · Updated 7 months ago
shreyaspadhy / CMCD
Implementing Controlled Monte Carlo Diffusions (ICLR 2024)
☆18 · Sep 30, 2024 · Updated last year
wmn-231314 / diffusion-data-constraint
Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…
☆127 · Jan 10, 2026 · Updated 6 months ago
JinjieNi / dlms-are-super-data-learners
The official GitHub repo for "Diffusion Language Models are Super Data Learners".
☆227 · Nov 6, 2025 · Updated 8 months ago
Gen-Verse / dLLM-RL
[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
☆511 · Jan 28, 2026 · Updated 6 months ago
scxue / advantage_weighted_matching
Official code for the paper "Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models"
☆93 · Apr 23, 2026 · Updated 3 months ago
kuleshov-group / remdm
Remasking Discrete Diffusion Models with Inference-Time Scaling
☆77 · Feb 7, 2026 · Updated 5 months ago
brianlck / FlexMDM
☆55 · Sep 10, 2025 · Updated 10 months ago
s-sahoo / Eso-LMs
[ICML 2026] Esoteric Language Models
☆122 · Jul 13, 2026 · Updated 2 weeks ago
Francesco215 / text-diffusion
Generates text with diffusion models. Reproduction of the Continuous Diffusion for Categorical Data paper by DeepMind
☆18 · Dec 9, 2024 · Updated last year
hng / BiomolecularStructures.jl
The BiomolecularStructures package provides several Bioinformatics-related modules for Julia.
☆12 · Jun 13, 2017 · Updated 9 years ago
kuleshov-group / setdlms
[ICML 2026] Set Diffusion: Interpolating Token Orderings between Autoregression and Diffusion for Fast and Flexible Decoding
☆22 · Jul 20, 2026 · Updated last week
cychomatica / FreeDave
Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
☆23 · May 19, 2026 · Updated 2 months ago
kuleshov-group / bd3lms
[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
☆1,025 · Jul 10, 2025 · Updated last year
LituRout / ADLM
Anchored Diffusion Language Model (NeurIPS 2025)
☆30 · Oct 13, 2025 · Updated 9 months ago
Mosasaur5526 / BCM-iCT-torch
Official implementation of our paper "Bidirectional Consistency Models", and a reproduction of Improved Consistency Models (iCT).
☆27 · May 10, 2025 · Updated last year
kuleshov-group / discrete-diffusion-guidance
Simple Guidance Mechanisms for Discrete Diffusion Models
☆92 · Dec 16, 2024 · Updated last year
horseee / dKV-Cache
[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models
☆135 · May 22, 2025 · Updated last year
ML-GSAI / LLaDA-1.5
☆55 · Apr 14, 2026 · Updated 3 months ago
czg1225 / dParallel
[ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs
☆65 · Apr 12, 2026 · Updated 3 months ago
alibaba-damo-academy / Lingshu_MCP
Wraps Lingshu as an MCP tool
☆18 · Sep 16, 2025 · Updated 10 months ago
ChenWu98 / algorithmic-creativity
[ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
☆89 · May 26, 2025 · Updated last year
bansky-cl / Diffusion-LM-Papers
A list of diffusion papers in the NLP domain that I have read, mainly on text generation; the table will continue to be updated.
☆79 · Mar 24, 2025 · Updated last year