fal-ai-community/nano-mdm

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/fal-ai-community/nano-mdm)

fal-ai-community / nano-mdm

Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun

☆57

Alternatives and similar repositories for nano-mdm

Users that are interested in nano-mdm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

fal-ai / diffusion-speedrun
View on GitHub
Focused on fast experimentation and simplicity
☆77Dec 24, 2024Updated last year
cloneofsimo / min-max-in-dit
View on GitHub
☆27May 3, 2024Updated 2 years ago
fal-ai-community / minDDPD
View on GitHub
☆33Jan 6, 2025Updated last year
bentherien / mu_learned_optimization
View on GitHub
[Poster; ICLR 2026] [Oral; Neurips OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
☆16Apr 15, 2026Updated 3 months ago
cloneofsimo / d3pm
View on GitHub
Minimal Implementation of a D3PM in pytorch
☆310Apr 22, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
cloneofsimo / karras-power-ema-tutorial
View on GitHub
☆53Jan 6, 2024Updated 2 years ago
amaarora / melanoma_wandb
View on GitHub
☆16Mar 1, 2022Updated 4 years ago
cloneofsimo / min-max-gpt
View on GitHub
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
☆132Apr 17, 2024Updated 2 years ago
jeongukjae / smaller-labse
View on GitHub
Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE
☆20Sep 22, 2021Updated 4 years ago
fal-ai-community / llmdifftracker
View on GitHub
Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models)
☆32Feb 27, 2025Updated last year
cloneofsimo / project_RF
View on GitHub
☆24Jun 4, 2024Updated 2 years ago
igul222 / plaid
View on GitHub
☆115May 29, 2023Updated 3 years ago
cloneofsimo / repa-rf
View on GitHub
☆32Nov 4, 2024Updated last year
cloneofsimo / zeroshampoo
View on GitHub
☆33Sep 10, 2024Updated last year
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
edwardmilsom / function-space-learning-rates-paper
View on GitHub
Code for the paper "Function-Space Learning Rates"
☆23Jun 3, 2025Updated last year
lzzcd001 / nabla-gfn
View on GitHub
Official Implementation of Nabla-GFlowNet (ICLR 2025)
☆28May 3, 2025Updated last year
maomaocun / dLLM-cache
View on GitHub
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…
☆212May 1, 2026Updated 2 months ago
drarijitdas / Natural-GaLore
View on GitHub
An extention to the GaLore paper, to perform Natural Gradient Descent in low rank subspace
☆19Oct 21, 2024Updated last year
edenartlab / flux-trainer
View on GitHub
Eden Flux LoRA trainer and full-finetuning
☆23Mar 21, 2025Updated last year
lumalabs / imm
View on GitHub
Official implementation of Inductive Moment Matching
☆585Jul 11, 2025Updated last year
HazyResearch / embroid
View on GitHub
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification
☆11Aug 12, 2023Updated 2 years ago
duerig / StyleTTS2
View on GitHub
StyleTTS 2 Optimized Training Fork
☆32Feb 2, 2025Updated last year
cloneofsimo / scaling-guide
View on GitHub
WIP
☆96Aug 13, 2024Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
bloc97 / DeMo
View on GitHub
DeMo: Decoupled Momentum Optimization
☆202Dec 2, 2024Updated last year
SilentView / EMCID
View on GitHub
Official Implementation for "Editing Massive Concepts in Text-to-Image Diffusion Models"
☆19Mar 21, 2024Updated 2 years ago
kuleshov-group / remdm
View on GitHub
Remasking Discrete Diffusion Models with Inference-Time Scaling
☆77Feb 7, 2026Updated 5 months ago
LINs-lab / GMem
View on GitHub
[Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models
☆43Mar 11, 2025Updated last year
cloneofsimo / ezmup
View on GitHub
Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only, dont use it for Adam
☆88Jul 28, 2024Updated 2 years ago
cloneofsimo / min-fsdp
View on GitHub
☆93Jul 5, 2024Updated 2 years ago
NathanGodey / qfilters
View on GitHub
Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)
☆34Mar 7, 2025Updated last year
shreyansh26 / An-Empirical-Model-of-Large-Batch-Training
View on GitHub
An approximate implementation of the OpenAI paper - An Empirical Model of Large-Batch Training for MNIST
☆11Nov 19, 2022Updated 3 years ago
INFINIQ-AI1 / CLIPVQDiffusion
View on GitHub
official implementation of "CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusi…
☆19Sep 5, 2024Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
kuleshov-group / mdlm
View on GitHub
[NeurIPS 2024] Simple and Effective Masked Diffusion Language Model
☆705Sep 29, 2025Updated 10 months ago
HKUNLP / diffusion-of-thoughts
View on GitHub
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
☆213Mar 4, 2025Updated last year
OpenMOSS / Sparse-dLLM
View on GitHub
☆29Oct 16, 2025Updated 9 months ago
SwayStar123 / microdiffusion
View on GitHub
☆49Feb 23, 2025Updated last year
sanchit-gandhi / whisper-flash-attention
View on GitHub
☆21Mar 7, 2023Updated 3 years ago
Nicolas-BZRD / llm-distillation
View on GitHub
☆11Feb 3, 2025Updated last year
jdeschena / sdtt
View on GitHub
[ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models
☆52Feb 26, 2026Updated 5 months ago