JinjieNi/MegaDLMs

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/JinjieNi/MegaDLMs)

JinjieNi / MegaDLMs

GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.

☆327

Alternatives and similar repositories for MegaDLMs

Users that are interested in MegaDLMs are comparing it to the libraries listed below

Sorting:

maomaocun / dLLM-Var
View on GitHub
The official implementation of dLLM-Var
☆31Nov 6, 2025Updated 4 months ago
JinjieNi / dlms-are-super-data-learners
View on GitHub
The official github repo for "Diffusion Language Models are Super Data Learners".
☆223Nov 6, 2025Updated 4 months ago
zhangyitonggg / dllm4code
View on GitHub
Offical implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation".
☆20Oct 29, 2025Updated 4 months ago
pengzhangzhi / Open-dLLM
View on GitHub
Open diffusion language model for code generation — releasing pretraining, evaluation, inference, and checkpoints.
☆536Updated this week
JinjieNi / Quokka
View on GitHub
The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca…
☆45Nov 6, 2025Updated 4 months ago
isaacus-dev / mleb
View on GitHub
The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB).
☆31Feb 24, 2026Updated last week
SJTU-DENG-Lab / Discrete-Diffusion-Forcing
View on GitHub
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
☆241Feb 3, 2026Updated last month
horseee / dKV-Cache
View on GitHub
[NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models
☆130May 22, 2025Updated 9 months ago
sail-sg / odc
View on GitHub
On demand communication
☆32Feb 26, 2026Updated last week
yegcjs / DiffusionLLM
View on GitHub
Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"
☆84Jan 24, 2024Updated 2 years ago
NVlabs / Fast-dLLM
View on GitHub
Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"
☆865Jan 28, 2026Updated last month
DocTron-hub / OCRVerse
View on GitHub
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
☆29Feb 4, 2026Updated last month
LARK-AI-Lab / CodeScaler
View on GitHub
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆29Feb 23, 2026Updated last week
Gen-Verse / dLLM-RL
View on GitHub
[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
☆435Jan 28, 2026Updated last month
sail-sg / feedback-conditional-policy
View on GitHub
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆59Jan 5, 2026Updated 2 months ago
ZFTurbo / DrivenData-Identify-Fish-Challenge-2nd-Place-Solution
View on GitHub
Solution for N+1 fish, N+2 fish DrivenData competition (2nd place)
☆13Sep 12, 2019Updated 6 years ago
kuleshov-group / e2d2
View on GitHub
[NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference
☆36Oct 29, 2025Updated 4 months ago
w-yibo / R1-Compress
View on GitHub
[NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search
☆17Jan 24, 2026Updated last month
maple-research-lab / LLaDOU
View on GitHub
Implementation of "Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models" [NeurIPS 2025]
☆74Dec 17, 2025Updated 2 months ago
Tomiinek / Aargh
View on GitHub
☆12Jan 2, 2024Updated 2 years ago
STARE-bench / STARE
View on GitHub
☆16Oct 12, 2025Updated 4 months ago
XavierJiezou / Face-MoGLE
View on GitHub
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
☆28Dec 10, 2025Updated 2 months ago
NikolaiT / free-proxy-list
View on GitHub
List of free and checked http, https, socks4 and socks5 proxies
☆17Updated this week
DreamLM / DreamOn
View on GitHub
Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas
☆103Feb 3, 2026Updated last month
sail-sg / SimLayerKV
View on GitHub
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
☆51Oct 18, 2024Updated last year
VILA-Lab / Awesome-DLMs
View on GitHub
The official GitHub repo for the survey paper "A Survey on Diffusion Language Models".
☆818Feb 3, 2026Updated last month
ziplab / CoV
View on GitHub
CoV: Chain-of-View Prompting for Spatial Reasoning
☆51Jan 23, 2026Updated last month
FreedomIntelligence / EchoX
View on GitHub
EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs
☆46Sep 19, 2025Updated 5 months ago
OpenMOSS / VehicleWorld
View on GitHub
VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, …
☆21Sep 16, 2025Updated 5 months ago
psg-mit / nightjarpy
View on GitHub
Python library to add support for embedding natural code in Python with shared program state.
☆24Jan 20, 2026Updated last month
hengyuan-hu / jax-vs-pytorch
View on GitHub
☆13Feb 25, 2025Updated last year
OpenLMLab / ParallelTokenizer
View on GitHub
Use the tokenizer in parallel to achieve superior acceleration
☆20Mar 21, 2024Updated last year
OpenMOSS / DiRL
View on GitHub
☆149Feb 25, 2026Updated last week
VizuaraAILabs / truly-open-gpt-oss
View on GitHub
A truly open version of gpt-oss which shows the entire pre-training from scratch
☆86Sep 4, 2025Updated 6 months ago
Alpha-VLLM / Lumina-DiMOO
View on GitHub
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
☆941Dec 27, 2025Updated 2 months ago
andrew-cr / tauLDR
View on GitHub
Code for the paper https://arxiv.org/abs/2205.14987v2
☆58Apr 18, 2024Updated last year
gjq100 / Graph-Counselor
View on GitHub
☆28Jun 5, 2025Updated 9 months ago
kyegomez / PaLM2-VAdapter
View on GitHub
Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…
☆17Nov 11, 2024Updated last year
CURRENTF / LowRankClone
View on GitHub
[NeurIPS 2025 Spotlight] A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone.
☆45Oct 29, 2025Updated 4 months ago